I'm struggling to work out the best way to store n+1 objects in a Solr document.

I am storing a CV/resume as a Solr document, and I am looking at storing two different data types in it: "education" and "employment".

If we look at education, the object looks like this:

{
 "Establishment" => 'Oxford',
 "Subject" => 'Computing',
 "Type" => 'Degree',
 "Grade" => '2:1'
}

A CV can have n+1 of these objects, depending on its contents. When I search for CVs with Establishment = Oxford, Subject = Computing and Grade = 2:1, the search must match only a CV where a single education object has all three values, not one where the same subject and grade belong to a different establishment.

I don't think a multivalued field would help, or that it can even store n+1 objects of this type.
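To make the concern concrete, here is a minimal sketch in plain Python (no Solr involved) of what happens if the education objects are flattened into parallel multivalued fields. The sample CV, the lowercase field names and the helper functions are all hypothetical; the matching logic only imitates checking each field independently.

```python
# Hypothetical CV with two education entries (sample data, not from a real index).
cv = {
    "id": "cv-1",
    "education": [
        {"establishment": "Oxford", "subject": "History", "grade": "2:1"},
        {"establishment": "Leeds", "subject": "Computing", "grade": "2:1"},
    ],
}


def flatten(cv):
    """Flatten the education entries into parallel multivalued fields."""
    doc = {"id": cv["id"], "establishment": [], "subject": [], "grade": []}
    for entry in cv["education"]:
        for field in ("establishment", "subject", "grade"):
            doc[field].append(entry[field])
    return doc


def matches_flat(doc, **criteria):
    """Check each criterion independently against its multivalued field."""
    return all(value in doc[field] for field, value in criteria.items())


def matches_per_object(cv, **criteria):
    """Require one single education entry to satisfy every criterion."""
    return any(
        all(entry[field] == value for field, value in criteria.items())
        for entry in cv["education"]
    )


query = {"establishment": "Oxford", "subject": "Computing", "grade": "2:1"}
flat_result = matches_flat(flatten(cv), **query)   # fields match across entries
object_result = matches_per_object(cv, **query)    # no single entry matches
print(flat_result, object_result)
```

The flattened document matches even though nobody studied Computing at Oxford, because the link between the values of one entry is lost.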

My question is: how do I set up Solr to store this type of data against one "CV" Solr document so that it is searchable as part of a general search of the index?

Accepted Answer

You essentially want to turn Solr into a relational database, i.e. you want to enforce some structure on your documents rather than having them just be a bag of words.

If you need relations, then you need relations. The only way I can think of to accomplish this is to index education objects separately, and then have a "foreign key" from the resume.
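A rough sketch of that layout in plain Python, building the documents as dicts rather than calling Solr. Everything named here (`doc_type`, `cv_id`, `split_cv`, `matching_cv_ids`) is an assumption for illustration, and the key is stored on the education document pointing back at the CV; storing a list of education ids on the CV would work as well.

```python
# Hypothetical CV record; every field name here is an assumption.
cv = {
    "id": "cv-1",
    "name": "A. Candidate",
    "education": [
        {"establishment": "Oxford", "subject": "Computing", "type": "Degree", "grade": "2:1"},
        {"establishment": "Leeds", "subject": "History", "type": "Degree", "grade": "First"},
    ],
}


def split_cv(cv):
    """Return a parent CV document and one document per education entry."""
    parent = {"id": cv["id"], "doc_type": "cv", "name": cv["name"]}
    children = [
        {
            "id": f"{cv['id']}-edu-{i}",
            "doc_type": "education",
            "cv_id": cv["id"],  # the "foreign key" linking back to the CV
            **entry,
        }
        for i, entry in enumerate(cv["education"])
    ]
    return parent, children


def matching_cv_ids(education_docs, **criteria):
    """Step 1 of a two-step search: find CV ids with one fully matching entry."""
    return sorted(
        {
            doc["cv_id"]
            for doc in education_docs
            if all(doc.get(field) == value for field, value in criteria.items())
        }
    )


parent, children = split_cv(cv)
hit = matching_cv_ids(children, establishment="Oxford", subject="Computing", grade="2:1")
miss = matching_cv_ids(children, establishment="Oxford", subject="History", grade="First")
print(hit, miss)
```

Step 2 would fetch the CV documents for the returned ids. Later Solr releases also include a join query parser (a query of the form `{!join from=cv_id to=id}establishment:Oxford AND subject:Computing`) that can do this lookup inside Solr; whether it is available depends on the version you run.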

Alternatively, it seems likely that your "n" will be pretty small, so you could just include each resume in the index multiple times, once with each education listing. This might throw off scoring a bit, but YMMV.
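The duplication approach might look something like the following, again as a plain Python sketch with hypothetical field names (`cv_id`, `text`) and helpers. Each copy carries the full resume text plus exactly one education entry, so a structured match can never mix values from two entries.

```python
# Hypothetical CV record used only for illustration.
cv = {
    "id": "cv-1",
    "text": "Software developer with five years of experience",
    "education": [
        {"establishment": "Oxford", "subject": "History", "grade": "2:1"},
        {"establishment": "Leeds", "subject": "Computing", "grade": "2:1"},
    ],
}


def denormalize(cv):
    """Emit one index document per education entry, repeating the resume text."""
    return [
        {"id": f"{cv['id']}-{i}", "cv_id": cv["id"], "text": cv["text"], **entry}
        for i, entry in enumerate(cv["education"])
    ]


def search(docs, **criteria):
    """Match all criteria within one document, then collapse hits by cv_id."""
    seen = []
    for doc in docs:
        matched = all(doc.get(field) == value for field, value in criteria.items())
        if matched and doc["cv_id"] not in seen:
            seen.append(doc["cv_id"])
    return seen


docs = denormalize(cv)
cross = search(docs, establishment="Oxford", subject="Computing", grade="2:1")
exact = search(docs, establishment="Leeds", subject="Computing", grade="2:1")
# A general text search now sees the same resume once per education entry.
text_hits = [doc["id"] for doc in docs if "developer" in doc["text"]]
print(cross, exact, text_hits)
```

The repeated text hits are the reason scoring can be thrown off: the same resume appears once per education entry, so results need to be collapsed on `cv_id`, either in the client as above or with Solr's result grouping if your version supports it.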

Written by Xodarap
The content is written by members of the stackoverflow.com community.
It is licensed under cc-wiki