Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Yes -- but not with that module. We're using it to tag documents (individual rows in Postgres and Solr) with user-defined metadata.

http://blog.documentcloud.org/blog/2011/05/arbitrary-metadat...

... which can then be used to power custom indexes of particular collections of documents, like this one:

http://www.nytimes.com/interactive/2011/12/02/us/oil-and-gas...

The fun bit being, that in both Postgres and Solr, you can still run a precise search (including joins for access control, sharing, and boolean full text clauses) with a single query because the HStore keys and values are indexed.



How did you build it? Are you using hstore SQL directly? Do you just have migrations with raw SQL in them?


Let's dig through it a bit. Our hstore column isn't directly on the document row, but rather in a joined "docdata" table. There was a semi-raw migration to add it:

https://github.com/documentcloud/documentcloud/blob/master/d...

The "Docdata" model handles serializing JSON hashes of keys -> string values into the hstore format, and parsing 'em back out. This is a little loose, but works for most of the simple things you'd like for tagging documents (senator: Kerry, statecode: WV, recordid: 019281)

https://github.com/documentcloud/documentcloud/blob/master/a...

... the resulting effect is that from the Rails side, working with "document.data" directly is like working with a lazy-loaded hash.

The more interesting bit is the search side of things, here's the query parser and query object (which generates SQL and Solr queries):

https://github.com/documentcloud/documentcloud/blob/master/l...

https://github.com/documentcloud/documentcloud/blob/master/l...

... for any given query, we detect if Solr is required (if full-text is involved), or if the search can be satisfied purely in Postgres. So there's one way of generating the key:value query for each back end.

Hope that answers the question.

(Edit:) Forgot to mention that we add a little extra on top of vanilla hstore "key:value" queries. You can write "dob: *" to show documents with any senator tagged, and "dob: !" to show documents that haven't been tagged with a date of birth yet.


This is awesome, thank you (didn't think to check if you were in Github, but the explanations are great too).




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: