<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE TEI.2 PUBLIC "-//TEI P4//DTD Main Document Type//EN"
"http://www.infomotions.com/alex/dtd/tei2.dtd" [
<!ENTITY % TEI.XML         'INCLUDE' >
<!ENTITY % TEI.prose       'INCLUDE' >
<!ENTITY % TEI.linking     'INCLUDE' >
<!ENTITY % TEI.figures     'INCLUDE' >
<!ENTITY % TEI.names.dates 'INCLUDE' >
<!ATTLIST xptr   url CDATA #IMPLIED >
<!ATTLIST xref   url CDATA #IMPLIED >
<!ATTLIST figure url CDATA #IMPLIED >
]> 
<TEI.2>
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Study Carrels and Content Negotiation</title> 
        <author>Eric Lease Morgan</author>
        <respStmt>
          <resp>converted into TEI-conformant markup by</resp>
          <name>Eric Lease Morgan</name>
        </respStmt>
      </titleStmt>
      <publicationStmt>
        <publisher>Eric Lease Morgan, &#169; University of Notre Dame</publisher>
        <address>
        	<addrLine>emorgan@nd.edu</addrLine>
        </address>
        <distributor>Available through the Distant Reader at <xptr url='https://distantreader.org/blog/study-carrels-and-content-negotiation/' />.</distributor>
        <idno type='reader'>58</idno>
        <availability status='free'>
          <p>This document is distributed under a GNU Public License.</p>
        </availability>
      </publicationStmt>
      <notesStmt>
       <note type='abstract'>I have begun to create a collection of data sets I call "study carrels", and the collection exploits content negotiation -- an application programming interface native to HTTP.</note>
      </notesStmt>
      <sourceDesc>
        <p>I believe I originally shared this on the Code4Lib mailing list.</p>
      </sourceDesc>
    </fileDesc>
    <profileDesc>
      <creation>
        <date>2024-04-30</date>
      </creation>
      <textClass>
        <keywords>
          <list><item>study carrels</item></list>
        </keywords>
      </textClass>
    </profileDesc>
    <revisionDesc>
      <change>
<date>2024-04-30</date>
<respStmt>
<name>Eric Lease Morgan</name>
</respStmt>
<item>initial TEI encoding</item>
</change>
    </revisionDesc>
  </teiHeader>
  <text>
    <front>
    </front>
    <body>
      <div1><p>I have begun to create a collection of data sets I call "study carrels". See: <xref url='http://carrels.distantreader.org/'>http://carrels.distantreader.org/</xref></p>

<p>One of the goals of the collection is to make it machine-readable, and to accomplish that goal, the collection exploits an API native to HTTP, namely, content negotiation. Given a MIME-type and a URI, the HTTP server returns a representation of the resource in the requested type. The server has been configured to know about a number of different types, and each type is associated with a different representation. For example, given the MIME-type application/xhtml+xml, the server will return an XHTML file that is really a bibliography. Given a MIME-type of text/html, the server will return a report describing the data set. Given a MIME-type of application/rdf+xml, the server will return a Linked Data representation.</p>

<p>For example:</p>

<p rend='pre'>curl -L -H "Accept: text/html" \
http://carrels.distantreader.org/subject-americanWitAndHumorPictorial-gutenberg</p>

<p>Or</p>

<p rend='pre'>curl -L -H "Accept: application/xhtml+xml" \
http://carrels.distantreader.org/subject-americanWitAndHumorPictorial-gutenberg</p>

<p>I have written three short Python scripts demonstrating how content negotiation can be exploited. The first visualizes the content of the study carrel as a network graph. More specifically, the MIME-type of application/gml is sent to the server, a Graph Markup Language file is returned, and finally the file is rendered as an image. See <xref url='./carrel2graph.py'>carrel2graph.py</xref>.</p>

<p rend='center'><figure url='./network-graph-small.jpg'/><lb />
<xref url='./network-graph.jpg'>visualization of a study carrel in the form of a network graph</xref></p>
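
<p>The script itself is not reproduced here, but a minimal sketch of the idea might look like the following; it assumes the requests, networkx, and matplotlib libraries, and the output filename is merely illustrative:</p>

<p rend='pre'># carrel2graph.py (sketch) - visualize a study carrel as a network graph
import requests
import networkx as nx
import matplotlib.pyplot as plt

CARREL = 'http://carrels.distantreader.org/subject-americanWitAndHumorPictorial-gutenberg'

# ask the server for the Graph Markup Language (GML) representation
response = requests.get(CARREL, headers={'Accept': 'application/gml'}, allow_redirects=True)

# parse the GML into a graph and render the graph as an image
graph = nx.parse_gml(response.text)
nx.draw(graph, node_size=10, with_labels=False)
plt.savefig('network-graph.jpg')</p>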

<p>As a second example, the MIME-type of application/json is specified, and the script ultimately returns a tab-delimited stream containing authors, titles, and URLs of the items in the data set. This stream can be saved to a file and imported into a spreadsheet program for more detailed analysis. See <xref url='./carrel2tsv.py'>carrel2tsv.py</xref>.</p>
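
<p>Again, the following is only a sketch of the idea, not the script itself; in particular, the JSON field names (author, title, and url) are assumptions about the shape of the server's response:</p>

<p rend='pre'># carrel2tsv.py (sketch) - output authors, titles, and URLs as tab-delimited text
import requests

CARREL = 'http://carrels.distantreader.org/subject-americanWitAndHumorPictorial-gutenberg'

# ask the server for the JSON representation of the carrel
items = requests.get(CARREL, headers={'Accept': 'application/json'}).json()

# output a tab-delimited stream; the field names are assumptions
print('author\ttitle\turl')
for item in items:
    print('\t'.join([item.get('author', ''), item.get('title', ''), item.get('url', '')]))</p>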

<p>The third script -- <xref url='./list-rdf.py'>list-rdf.py</xref> -- returns a list of URIs pointing to Linked Data representations of each study carrel in the collection. These URIs and all their content could then be imported into an RDF triple store for the purposes of supporting the Semantic Web.</p>
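
<p>A sketch along the same lines follows; the local file of identifiers (carrels.txt) is a hypothetical stand-in, because this posting does not say where list-rdf.py gets its list of carrels:</p>

<p rend='pre'># list-rdf.py (sketch) - list URIs of Linked Data representations
ROOT = 'http://carrels.distantreader.org/'

# carrels.txt is a hypothetical file listing one carrel identifier per line;
# requesting any of the resulting URIs with an Accept header of
# application/rdf+xml returns its Linked Data representation
with open('carrels.txt') as handle:
    for identifier in handle:
        print(ROOT + identifier.strip())</p>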

<p>Here's the point. As a librarian, I desire to create collections and provide services against them. My collection is a set of data sets -- consistently structured amalgamations of information. By putting these collections on the Web and taking advantage of HTTP's native APIs, I do not have to write special software that will eventually break. Moreover, robots can crawl the collection, and people can write interesting applications against it. Thus, I support services. As long as the server is running, the collection will be usable. No broken links. No software to be upgraded. Very little maintenance required. 'Sounds sustainable to me.</p>

<p>Note to self: Implement a call to the root of the domain that returns a list of all the collection's identifiers -- URIs.</p>

<p>P.S. For a good time, try this one to download a study carrel:</p>

<p rend='pre'>curl -L -H "Accept: application/zip" \
http://carrels.distantreader.org/subject-americanWitAndHumorPictorial-gutenberg &gt; index.zip</p>

</div1>

    </body>
    <back>
    </back>
  </text>
</TEI.2>
