<?xml version="1.0" encoding="UTF-8" ?>
<!DOCTYPE TEI.2 PUBLIC "-//TEI P4//DTD Main Document Type//EN"
"http://www.infomotions.com/alex/dtd/tei2.dtd" [
<!ENTITY % TEI.XML         'INCLUDE' >
<!ENTITY % TEI.prose       'INCLUDE' >
<!ENTITY % TEI.linking     'INCLUDE' >
<!ENTITY % TEI.figures     'INCLUDE' >
<!ENTITY % TEI.names.dates 'INCLUDE' >
<!ATTLIST xptr   url CDATA #IMPLIED >
<!ATTLIST xref   url CDATA #IMPLIED >
<!ATTLIST figure url CDATA #IMPLIED >
]> 
<TEI.2>
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Fun with high-performance computing</title> 
        <author>Eric Lease Morgan</author>
        <respStmt>
          <resp>converted into TEI-conformant markup by</resp>
          <name>Eric Lease Morgan</name>
        </respStmt>
      </titleStmt>
      <publicationStmt>
        <publisher>Eric Lease Morgan, &#169; University of Notre Dame</publisher>
        <address>
        	<addrLine>emorgan@nd.edu</addrLine>
        </address>
        <distributor>Available through the Distant Reader at <xptr url='https://distantreader.org/blog/fun-with-hpc/' />.</distributor>
        <idno type='reader'>55</idno>
        <availability status='free'>
          <p>This document is distributed under a GNU General Public License.</p>
        </availability>
      </publicationStmt>
      <notesStmt>
       <note type='abstract'>I am having fun with high-performance computing (HPC) and Slurm.</note>
      </notesStmt>
      <sourceDesc>
        <p>This content was originally shared on a Slack channel.</p>
      </sourceDesc>
    </fileDesc>
    <profileDesc>
      <creation>
        <date>2024-04-30</date>
      </creation>
      <textClass>
        <keywords>
          <list><item>High-Performance Computing</item><item>Amazon Web Services</item></list>
        </keywords>
      </textClass>
    </profileDesc>
    <revisionDesc>
      <change>
        <date>2024-04-30</date>
        <respStmt>
          <name>Eric Lease Morgan</name>
        </respStmt>
        <item>initial TEI encoding</item>
      </change>
    </revisionDesc>
  </teiHeader>
  <text>
    <front>
    </front>
    <body>
      <div1><p>I am having fun with high-performance computing (HPC) and Slurm.</p>

<p>Using a Python library (<xref url='https://docs.aws.amazon.com/parallelcluster/latest/ug/pcluster-v3.html'>pcluster</xref>) on Amazon Web Services, I have spun up an HPC cluster with one head node and 32 worker nodes. (See image.) Each worker node has 32 cores and about 256 GB of RAM. That means I have 1,024 (32 x 32) cores at my disposal.</p>

<p  rend='center'><figure url='./squeue-small.png' /><lb /><xref url='./squeue.png'>squeue - a list of jobs</xref></p>

<p>I then created a list of 710 jobs. Each job takes a set of journal articles, does natural language processing against them, and outputs a data set called a "study carrel". Each job also does a bit of modeling against the carrel. Taken together, the jobs cover tens of thousands of articles.</p>

<p>I submit the jobs to the cluster using the attached Slurm sbatch file, which is really a <xref url='build-journals.sh'>glorified shell script</xref>. Nodes are spun up, and the work continues until all the jobs are completed. I can watch the progress of the jobs by monitoring the queue as well as a few log files. From the second attachment you can see there are thirty-two nodes working. Some have been working for only a few minutes, while others have been working for more than an hour. Some of the jobs have only a few dozen articles to process; others have hundreds. At the time of the snapshot, about 125 jobs had been completed.</p>
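
<p>As a rough illustration of the submission step, and not a copy of the linked build-journals.sh, a list of jobs can be handed to Slurm as a job array, where each array task picks one line from a job list. The sketch below is in Python and only builds the text of such a batch script. The function name build_sbatch, the job list jobs.txt, the job name carrels, the log file pattern, and the worker script build-carrel.sh are all hypothetical; the #SBATCH options and the SLURM_ARRAY_TASK_ID variable are standard Slurm features.</p>

<p><![CDATA[
```python
def build_sbatch(job_list: str, job_count: int, max_parallel: int, cpus: int) -> str:
    """Return the text of a Slurm batch script that runs one job-list line per array task."""
    lines = [
        "#!/usr/bin/env bash",
        "#SBATCH --job-name=carrels",
        # one array task per job; the %N suffix caps how many run at once
        f"#SBATCH --array=1-{job_count}%{max_parallel}",
        f"#SBATCH --cpus-per-task={cpus}",
        # %A is the array job id and %a is the array task index
        "#SBATCH --output=carrel-%A_%a.log",
        "",
        "# pick the line of the job list that matches this array task",
        f'JOB=$(sed -n "${{SLURM_ARRAY_TASK_ID}}p" {job_list})',
        "",
        "# build-carrel.sh is a hypothetical stand-in for the real work",
        './build-carrel.sh "$JOB"',
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # assumed values: 710 jobs, at most 32 at a time, 32 cores per task
    print(build_sbatch("jobs.txt", 710, 32, 32))
```
]]></p>

<p>Saved to a file, the printed script could be submitted with sbatch and watched with squeue. Capping the array at 32 simultaneous tasks is an assumption chosen to match the 32 worker nodes described above.</p>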

<p>Parallel processing is a very powerful computing technique. When one has access to multiple computers, or to one very big computer, it can make long computations very short. For example, if I were not using parallel processing against my study carrels, then the processing would take weeks; in this case, it takes only a few hours.</p>
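
<p>To get a feel for why the difference is so large, the following Python sketch simulates a simple queue in which each job is handed to whichever worker becomes free first, then compares the total elapsed time for one worker against many. This is only a toy model, not how Slurm actually schedules work. The function name makespan and the job durations are made up for illustration and are not measurements from the cluster.</p>

<p><![CDATA[
```python
import heapq


def makespan(durations, workers):
    """Return the elapsed time when each job goes to the worker that frees up first."""
    # every worker starts out idle at time zero
    free_at = [0.0] * workers
    heapq.heapify(free_at)
    for duration in durations:
        start = heapq.heappop(free_at)            # earliest available worker
        heapq.heappush(free_at, start + duration)  # busy until the job ends
    return max(free_at)


if __name__ == "__main__":
    # hypothetical durations in minutes; some jobs are short, some are long
    durations = [5, 60, 12, 45, 8, 30, 90, 15] * 4
    print("one worker   :", makespan(durations, 1), "minutes")
    print("eight workers:", makespan(durations, 8), "minutes")
```
]]></p>

<p>With one worker the elapsed time is simply the sum of all the durations. With many workers it approaches the sum divided by the number of workers, although it can never be shorter than the single longest job.</p>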

<p>Fun with HPCs.</p>
</div1>



    </body>
    <back>
    </back>
  </text>
</TEI.2>
