Models
Model/Resource Documentation¶
The following describes the tables/models comprising our relational database. Each model features a short description noting its purpose and key fields. A table listing field names, datatypes, and a short description of each field is also included. The Dockets
, PublicComments
, and Documents
tables mimic the regulations.gov tables, while the NLP Output
table is our data generated based on the regulations.gov data tables that is used to populate the website. Where adequate NLP output can't be generated due to data sparsity, information from PublicComments
and Documents
is used.
Dockets
Dockets
represents collections of documents relevant to a proposed rule or notice. A given docket can contain documents available for commenting that represent the proposed rule change along with supplementary documents, such as a cost-benefit analysis, that support the proposed rule.
Since the dockets themselves are unavailable to comment on (only the documents contained within can recieve comments), we chose to collect only enough information to link docments to a docket (via the id
field ) and other basic information (date, posting agency).
Name | Type | Description |
---|---|---|
id | PrimaryKey (Dockets) | Regulations.gov UUID for a docket |
docketType | CharField | Type of docket (i.e. Rulemaking, Nonrulemaking) |
lastModifiedDate | DateTime | Date docket was last updated |
agencyID | CharField | ID of agency who posted docket |
objectID | CharField | Regulations.gov UUID for API response object |
PublicComments
PublicComments
represents an individual comment posted to a document. Each comment features its own UUID (id
) and is linked to its corresponding document by the document
id (also links this table to the Documents
table). Each comment is represented by the text of the comment, along with any available information collected by Regulations.gov on the comment. This data describes both the comment, such as whether the comment was withdrawn by the user, if the commented was posted to a restricted document (only open to certain agencies or interest groups), or the number of comments that feature the same text.
This table also stores data on individuals who posted a comment, including their name, location, and organization.
Name | Type | Description |
---|---|---|
id | PrimaryKey (Comment) | Regulations.gov UUID for a document |
commentOn | CharField | Document the commnent is posted to |
document | ForeignKey (Documents) | Document ID a comment is posted to |
duplicateComments | IntegerField | Number of duplicate comments |
stateProvinceRegion | CharField | State or province a comment is posted from |
subtype | CharField | Classifier for source of comment (e.g Member of Congress, Mass Mail Campaign) |
objectId | CharField | Regulations.gov UUID for API response object |
comment | TextField | Text of comment |
firstName | CharField | First name of commenter |
lastName | CharField | Last name of commenter |
address1 | CharField | First line of commenter's address |
address2 | CharField | Second line of commenter's address |
city | CharField | City of the commenter's address |
country | CharField | Commenter's country |
EmailField | Email address of commenter | |
phone | CharField | Phone number of commenter |
govAgency | CharField | Agency receiving comments |
govAgencyType | CharField | Type of agency receiving comments |
organization | CharField | Commenter's organization |
originalDocumentId | CharField | Regulations.gov document ID |
modifyDate | DateTime | Date the comment was last modified |
pageCount | IntegerField | Number of pages for the comment |
postedDate | DateTime | Date the comment was inital posted |
receiveDate | DateTime | Date comment was recieved by posting agency |
title | CharField | Title of the commenter |
withdrawn | Boolean | Bool if the comment was withdrawn |
reasonWithdrawn | CharField | User submitted reason a comment was withdrawn |
zip | charField | Zip code of the commenter |
restrictReasonType | CharField | If document has been restricted for comment to certain users |
restrictReason | CharField | Summary of reason for comment restriction |
Documents
Documents
stores metadata on documents that can have comments posted to. To minimize our data intake, CivicLens only collects documents labeled as open-for-comment by the federal government. This does not mean that every document we collect features posted comments, but that some subset of the public is able to comment on each document. Each document also stores a URL to the Federal Register's XML version of the full document text (fullTextXmlUrl
). This is used to extract the full text of the proposed rule which is not available through regulations.gov. summary
contains the plain English summary of the rule written by the Federal government which will be available on our site under the /documents
endpoint.
Name | Type | Description |
---|---|---|
id | PrimaryKey (Documents) | Regulations.gov UUID for a document |
documentType | CharField | Type of document (i.e. Proposed Rule, Notice) |
lastModifiedDate | DateTime | Date document was last updated |
withdrawn | BooleanField | Boolean if the documentment was withdrawn |
agencyID | CharField | ID of agency who posted document |
commentEndDate | DateTime | End date of document commenting period |
commentStartDate | DateTime | Start date of document commenting period |
objectId | CharField | Regulations.gov UUID for API response object |
fullTextXmlUrl | URL | Link to Federal Registar's XML text of rule |
subAgy | CharField | Relevant office or department of posting agency |
agencyType | CharField | Name of posting agency (e.g. FDA) |
CRF | CharField | Code of Federal Regulations number |
RIN | CharField | Regulation Identification Number used by Federal Register |
title | CharField | Regulations.gov document title |
summary | TextField | Plain English summary of document |
furtherInformation | TextField | Additional information provided with document |
NLP Output
The NLP Output
table stores the information generated by the NLP pipeline that is run by our website. There is one row for each document in the database which contains information on representative comments, new titles, sentiment, and more. This data is used to populate the search
and document
pages on the website with relevant information about a document.
Name | Type | Description |
---|---|---|
document_id | OneToOneField (Documents) | Regulations.gov UUID for a document |
comments | JSONField | Representative comments that can be form letters or unique comments |
doc_plain_english_title | CharField | AI generated simple title for a document |
num_total_comments | IntegerField | Number of all comments on a document |
num_unique_comments | IntegerField | Number of unique submissions including types of form letters |
num_representative_comment | IntegerField | Number of representative comments |
topics | JSONField | NLP generated topics for the document from the representative comments |
num_topics | IntegerField | Number of topics identified |
last_updated | DateTimeField | Last time the NLP was updated for the document |
created_at | DateTimeField | Time the document was created |
search_topics | TextField | NLP generated topics for the django search to reference |
is_representative | BooleanField | Does the document have representative comments |