<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE TEI.2 PUBLIC "-//TEI P4//DTD Main Document Type//EN"
"http://www.infomotions.com/alex/dtd/tei2.dtd" [
<!ENTITY % TEI.XML         'INCLUDE' >
<!ENTITY % TEI.prose       'INCLUDE' >
<!ENTITY % TEI.linking     'INCLUDE' >
<!ENTITY % TEI.figures     'INCLUDE' >
<!ENTITY % TEI.names.dates 'INCLUDE' >
<!ATTLIST xptr   url CDATA #IMPLIED >
<!ATTLIST xref   url CDATA #IMPLIED >
<!ATTLIST figure url CDATA #IMPLIED >
]> 
<TEI.2>
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>What is love?</title> 
        <author>Eric Lease Morgan</author>
        <respStmt>
          <resp>converted into TEI-conformant markup by</resp>
          <name>Eric Lease Morgan</name>
        </respStmt>
      </titleStmt>
      <publicationStmt>
        <publisher>Eric Lease Morgan, &#169; University of Notre Dame</publisher>
        <address>
        	<addrLine>emorgan@nd.edu</addrLine>
        </address>
        <distributor>Available through the Distant Reader at <xptr url='https://distantreader.org/blog/what-is-love/' />.</distributor>
        <idno type='reader'>23</idno>
        <availability status='free'>
          <p>This document is distributed under a GNU Public License.</p>
        </availability>
      </publicationStmt>
      <notesStmt>
       <note type='abstract'>Addressing the timeless question, "What is love?"</note>
      </notesStmt>
      <sourceDesc>
        <p>This was originally posted to the Code4Lib Slack channel (October 24, 2022).</p>
      </sourceDesc>
    </fileDesc>
    <profileDesc>
      <creation>
        <date>2022-11-14</date>
      </creation>
      <textClass>
        <keywords>
          <list><item>readings</item></list>
        </keywords>
      </textClass>
    </profileDesc>
    <revisionDesc>
      <change>
        <date>2022-11-14</date>
        <respStmt>
          <name>Eric Lease Morgan</name>
        </respStmt>
        <item>initial TEI encoding</item>
      </change>
    </revisionDesc>
  </teiHeader>
  <text>
    <front>
    </front>
    <body>
      <div>
  <p>A graduate student, with whom I've been working, assembled the full text of 600 Victorian novels. I used the 
  <xref url='https://reader-toolbox.readthedocs.io'>Distant Reader Toolbox</xref> to model the corpus, and in the end, the corpus is close to 93 million words long. (By comparison, the Bible is about 0.8 million words long.)</p>
  <p>I then applied concordancing to the corpus to answer the question, "What is love?", and some of the snippets returned are listed below.</p>
  <list type='bulleted'>
    <item>a beautiful thing." "i can not say. not experienced in beau</item>
    <item>a horror to every pure mind; it was to the minister the mos</item>
    <item>a pathological condition. i am painfully aware of the objec</item>
    <item>cheated every day in this way by offenders much more seriou</item>
    <item>indestructible! its holy flame for ever burneth-from heaven</item>
    <item>not so necessary to us women as people think. fine writers</item>
    <item>so splendid, it is the greatest pity it should be impossibl</item>
    <item>surely the early home of the heart. it came upon him so ple</item>
    <item>the greatest of all hypocrites." "perhaps that is true," sa</item>
    <item>the soul of an irish dragoon!' by jove, i am as delighted t</item>
    <item>worth the whole broad earth; give that, you give us all!" "</item>
  </list>
  <p>(The complete list is linked at 
  <xref url='./love-is.txt'>./love-is.txt</xref>.)</p>
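  <p>For readers who have not used a concordance, the idea can be sketched in a few lines of Python. This is only a minimal illustration and not how the Distant Reader Toolbox does its work; the function name <q>concordance</q>, the <q>width</q> parameter, and the sample text are all assumptions made for this example. It lowercases the text, finds each occurrence of the query, and returns the characters that immediately follow it.</p>
  <p rend='code'><![CDATA[
```python
import re


def concordance(text, query, width=60):
    """Return the `width` characters that follow each occurrence of `query`."""
    # Normalize case and whitespace so a match is not broken by a line break.
    normalized = re.sub(r"\s+", " ", text.lower())
    pattern = re.compile(r"\b" + re.escape(query.lower()) + r"\s+")
    snippets = []
    for match in pattern.finditer(normalized):
        start = match.end()
        snippets.append(normalized[start:start + width])
    # Sort alphabetically so similar continuations end up next to each other.
    return sorted(snippets)


# Made-up sample text, standing in for the full corpus.
sample = (
    "Love is a beautiful thing. Some say love is\n"
    "not so necessary to us women as people think."
)
for snippet in concordance(sample, "love is", width=30):
    print(snippet)
```
]]></p>
  <p>Because each snippet is cut at a fixed number of characters, it can end in the middle of a word, just as the snippets above do.</p>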
  <p>A colleague (Ben Companjen) then asked, "I [wonder] if certain literal expressions were more common, not necessarily if there was a classification?"</p>
  <p>I then said to myself, "Literal expressions? Such are ngrams, and the <xref url='./ngrams.py'>attached Python script</xref>
 outputs a frequency list of ngrams from a configured file." I applied the script to the definitions, and some of the more interesting results include the following phrases and their frequencies:</p>
  <list type='ordered'>
    <item>a man (22)</item>
    <item>the world (16)</item>
    <item>stronger than (15)</item>
    <item>the heart (11)</item>
    <item>a thing (10)</item>
    <item>a woman (9)</item>
    <item>not love (9)</item>
    <item>more than (8)</item>
    <item>a passion (8)</item>
    <item>the first (6)</item>
    <item>my heart (6)</item>
    <item>blind and (4)</item>
    <item>creative energy (2)</item>
    <item>one god (2)</item>
    <item>his religion (2)</item>
    <item>immortal and (2)</item>
  </list>
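  <p>The linked script is not reproduced here, but counting ngrams can be sketched roughly as follows. This is a hedged approximation and not necessarily what the script does; the function name <q>ngram_frequencies</q>, the tokenizing pattern, and the sample definitions are assumptions invented for illustration. It treats each snippet as its own unit, slides a window of two words across it, and tallies the resulting phrases.</p>
  <p rend='code'><![CDATA[
```python
import re
from collections import Counter


def ngram_frequencies(lines, n=2):
    """Count n-grams line by line, so a phrase never spans two snippets."""
    counts = Counter()
    for line in lines:
        # Keep lowercase letters and apostrophes; drop other punctuation.
        tokens = re.findall(r"[a-z']+", line.lower())
        # Slide a window of size n across the tokens of this line.
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


# Made-up definitions, standing in for the contents of love-is.txt.
definitions = [
    "a beautiful thing",
    "a pathological condition",
    "a horror to every pure mind",
    "stronger than death",
    "stronger than fear",
]
for phrase, count in ngram_frequencies(definitions).most_common(3):
    print(f"{phrase} ({count})")
```
]]></p>
  <p>Setting <q>n</q> to 3 would count three-word phrases instead. Whether function words such as "a" and "the" are filtered out is a separate design choice that this sketch leaves alone.</p>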
  <p>Fun with text mining, natural language processing, and data science.</p>
</div>

    </body>
    <back>
    </back>
  </text>
</TEI.2>
