Getting started with LinkedBrainz D2R mappings

For the benefit of any interested parties, here is a brief guide to getting a LinkedBrainz D2R server up and running.

Get the data

Get a MusicBrainz NGS Postgres database up and running.  The dumps can be found here - just grab the latest one.  You should only need the mbdump.tar.bz2 file.  However, importing the data requires a copy of the MusicBrainz NGS server as well.  Follow the README file - it is a moderately involved process setting up the server.  If you have problems, the MusicBrainz IRC channel is a good place to get help.

Run the mapping

Get a copy of the LinkedBrainz D2R fork, which includes the D2RQ mapping file. It should be fairly self-contained in terms of dependencies: all the byte-compiled Java bits are included, but of course you will need a JVM of some kind.

The database credentials in the musicbrainz_mapping.n3 file will need to be altered to match your own database settings.

# database connection settings
	map:database a d2rq:Database;
	    d2rq:jdbcDriver "org.postgresql.Driver";
	    d2rq:jdbcDSN "jdbc:postgresql://localhost/musicbrainz_db";
	    d2rq:username "kurtjx";
	    d2rq:password "kurtjx";
	    # relatively crude way to limit db load: cap the size of each result set
	    d2rq:resultSizeLimit 100;
	    .

It should only be necessary to change the d2rq:username and d2rq:password values, but you may also need to change the "musicbrainz_db" suffix of d2rq:jdbcDSN if you gave the database a different name.

Now start the SPARQL endpoint by running

 ./d2r-server musicbrainz_mapping.n3

Hopefully no unhandled exceptions will appear and the server will be online.  You can access the Snorql interface at http://localhost:2020/snorql/.

Hacking the mapping

The mapping of the MusicBrainz relational tables to RDF all occurs in the magical musicbrainz_mapping.n3 file.  Unfortunately, modularizing this file is not currently supported and it's quite large.  If you are not already familiar with the Turtle RDF syntax and the D2RQ mapping language, you will want to read up on both.
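
As a rough illustration of what entries in the mapping file look like, here is a minimal sketch of a D2RQ class map and one property bridge for the artist table. The names map:artist and map:artist_gid, the URI pattern, and the choice of mo:MusicArtist and mo:musicbrainz_guid are assumptions made for this example and are not copied from the actual musicbrainz_mapping.n3. map:database refers to the connection block shown earlier.

```turtle
@prefix map:  <#> .
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> .
@prefix mo:   <http://purl.org/ontology/mo/> .

# Hypothetical class map: one RDF resource per row of the artist table.
map:artist a d2rq:ClassMap ;
    d2rq:dataStorage map:database ;
    # Assumed URI pattern, built from the artist's MusicBrainz identifier.
    d2rq:uriPattern "artist/@@artist.gid@@#_" ;
    d2rq:class mo:MusicArtist .

# Hypothetical property bridge: expose the gid column as a literal.
map:artist_gid a d2rq:PropertyBridge ;
    d2rq:belongsToClassMap map:artist ;
    d2rq:property mo:musicbrainz_guid ;
    d2rq:column "artist.gid" .
```

A class map says which rows become resources and how their URIs are built; each property bridge then attaches one property to those resources.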

Of course, you will need to become familiar with the MusicBrainz Next Generation Schema.  The NGS diagram will be a very helpful reference.  One hint: the direction of the arrow in d2rq:join literals (e.g. "<=") generally (maybe always?) should follow the direction of the arrows between tables in the NGS diagram.
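
To make the join hint concrete, here is a small sketch of a property bridge that pulls a value from a second table. It assumes that artist.name is a foreign key into artist_name.id, as in the NGS schema at the time of writing, and that a class map called map:artist already exists for the artist table. The bridge name map:artist_name and the use of foaf:name are likewise assumptions for illustration only.

```turtle
@prefix map:  <#> .
@prefix d2rq: <http://www.wiwiss.fu-berlin.de/suhl/bizer/D2RQ/0.1#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .

# Hypothetical bridge: the artist's display name lives in artist_name,
# so the two tables have to be joined.
map:artist_name a d2rq:PropertyBridge ;
    d2rq:belongsToClassMap map:artist ;   # assumed existing class map
    d2rq:property foaf:name ;
    d2rq:column "artist_name.name" ;
    # The arrow points from the foreign key column to the primary key column.
    d2rq:join "artist.name => artist_name.id" .
```

Writing the same condition as "artist_name.id <= artist.name" should be equivalent, since the arrow still points at the primary key side; a plain "=" also works but gives D2RQ no direction information.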

The mappings are not complete and there are a few known issues with the mappings that are in place.  As additional issues arise, it is probably wise to continue to use the github issue queue to collect and track problems.  Happy mapping!!!

Please note that a more up-to-date fork is now available at https://github.com/BarryNorton/D2R-LinkedBrainz-Fork

Otherwise please follow these instructions

Note that the 'editors' dump is needed too or you get a foreign key violation:
http://tickets.musicbrainz.org/browse/MBS-987

Hi Folks

Some more useful stuff here from our goodselves. I wonder if your experiences with implementing D2R would provide useful lessons for the other JiscEXPO projects and the LD community generally? Might be worth a presentation somewhere?

Cheers, Adrian
JiscEXPO Liaison
