{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Text tokenization\n", "\n", "This section tokenizes the transcription data and adds new columns to the data frame for each transcription dataset.\n", "\n", "First, we run the definitions step from the previous section." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "%run 02_definitions.ipynb" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load each transcription dataset into a data frame\n", "The `load_csv` function reads the data at each path constant into a pandas data frame." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "a = load_csv(ANTHONY)\n", "c = load_csv(CATT)\n", "s = load_csv(STANTON)\n", "t = load_csv(TERRELL)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Optional: Preview the first five rows of a loaded dataset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### First five rows for Anthony dataset" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
|  | Campaign | Project | Item | ItemId | Asset | AssetId | AssetStatus | DownloadUrl | Transcription | Tags |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Susan B. Anthony Papers | Speeches and other writings | Susan B. Anthony Papers: Speeches and Writings... | mss11049038 | mss11049038-1 | 179295 | completed | http://tile.loc.gov/image-services/iiif/servic... | Susan B. Anthony SPEECHES AND WRITINGS FI... | May 1852 |
| 1 | Susan B. Anthony Papers | Speeches and other writings | Susan B. Anthony Papers: Speeches and Writings... | mss11049038 | mss11049038-2 | 179296 | completed | http://tile.loc.gov/image-services/iiif/servic... | /52\\r\\nS.B.A-\\r\\n\\r\\nDelivered for the\\r\\nFirs... | NaN |
| 2 | Susan B. Anthony Papers | Speeches and other writings | Susan B. Anthony Papers: Speeches and Writings... | mss11049038 | mss11049038-3 | 179297 | completed | http://tile.loc.gov/image-services/iiif/servic... | will the best & wisest of mothers continue\\r\\n... | temperance |
| 3 | Susan B. Anthony Papers | Speeches and other writings | Susan B. Anthony Papers: Speeches and Writings... | mss11049038 | mss11049038-4 | 179298 | completed | http://tile.loc.gov/image-services/iiif/servic... | [Mind] the youthful mind. Of how\\r\\nlittle av... | temperance |
| 4 | Susan B. Anthony Papers | Speeches and other writings | Susan B. Anthony Papers: Speeches and Writings... | mss11049038 | mss11049038-5 | 179299 | completed | http://tile.loc.gov/image-services/iiif/servic... | x\\r\\nWhile we labor to reclaim one generation ... | temperance |

|  | Campaign | Project | Item | ItemId | Asset | AssetId | AssetStatus | DownloadUrl | Transcription | Tags |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Carrie Chapman Catt Papers | Speeches and articles | Carrie Chapman Catt Papers: Speech and Article... | mss154040385 | mss154040385-1 | 189284 | completed | http://tile.loc.gov/image-services/iiif/servic... | CATT, Carrie Chapman\\r\\nSPEECH, ARTICLE, BOOK ... | NaN |
| 1 | Carrie Chapman Catt Papers | Speeches and articles | Carrie Chapman Catt Papers: Speech and Article... | mss154040385 | mss154040385-2 | 189285 | completed | http://tile.loc.gov/image-services/iiif/servic... | -2-\\r\\nWe appeal in the name of our foremother... | NaN |
| 2 | Carrie Chapman Catt Papers | Speeches and articles | Carrie Chapman Catt Papers: Speech and Article... | mss154040385 | mss154040385-3 | 189286 | completed | http://tile.loc.gov/image-services/iiif/servic... | AN APPEAL FOR LIBERTY. 1915\\r\\n\\r\\nBy Carri... | NaN |
| 3 | Carrie Chapman Catt Papers | Speeches and articles | Carrie Chapman Catt Papers: Speech and Article... | mss154040386 | mss154040386-1 | 189287 | completed | http://tile.loc.gov/image-services/iiif/servic... | CATT, Carrie Chapman\\r\\nSPEECH, ARTICLE, BOOK ... | NaN |
| 4 | Carrie Chapman Catt Papers | Speeches and articles | Carrie Chapman Catt Papers: Speech and Article... | mss154040386 | mss154040386-2 | 189288 | completed | http://tile.loc.gov/image-services/iiif/servic... | The \\r\\nWoman Citizen\\r\\nA WEEKLY CHRONICLE OF... | NaN |

|  | Campaign | Project | Item | ItemId | Asset | AssetId | AssetStatus | DownloadUrl | Transcription | Tags |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Elizabeth Cady Stanton Papers | General correspondence | Elizabeth Cady Stanton Papers: General Corresp... | mss412100001 | mss412100001-1 | 179712 | completed | http://tile.loc.gov/image-services/iiif/servic... | Elizabeth Cady Stanton GENERAL CORRESPONDENCE... | NaN |
| 1 | Elizabeth Cady Stanton Papers | General correspondence | Elizabeth Cady Stanton Papers: General Corresp... | mss412100001 | mss412100001-2 | 179713 | completed | http://tile.loc.gov/image-services/iiif/servic... | The following four letters are \\r\\nfrom Daniel... | Peter Smith; Daniel Cady; Judge Cady |
| 2 | Elizabeth Cady Stanton Papers | General correspondence | Elizabeth Cady Stanton Papers: General Corresp... | mss412100001 | mss412100001-3 | 179714 | completed | http://tile.loc.gov/image-services/iiif/servic... | 22 ... | NaN |
| 3 | Elizabeth Cady Stanton Papers | General correspondence | Elizabeth Cady Stanton Papers: General Corresp... | mss412100001 | mss412100001-4 | 179715 | completed | http://tile.loc.gov/image-services/iiif/servic... | he could to make her respectable & happy. That... | Peter Smith; Bonaparte |
| 4 | Elizabeth Cady Stanton Papers | General correspondence | Elizabeth Cady Stanton Papers: General Corresp... | mss412100001 | mss412100001-5 | 179716 | completed | http://tile.loc.gov/image-services/iiif/servic... | Johnstown 2 D Paid 10\\r\\n\\r\\n\\r\\nPeter Smi... | Peter Smith |

|  | Campaign | Project | Item | ItemId | Asset | AssetId | AssetStatus | DownloadUrl | Transcription | Tags |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Mary Church Terrell: Advocate for African Amer... | Address and appointment books | Mary Church Terrell Papers: Appointment Calend... | mss425490014 | mss425490014-1 | 7580 | completed | http://tile.loc.gov/image-services/iiif/servic... | Office Supplies typewriter ribbons fountain pe... | Mrs Ella Wheeler Wilcox; Woman Suffrage Conven... |
| 1 | Mary Church Terrell: Advocate for African Amer... | Address and appointment books | Mary Church Terrell Papers: Appointment Calend... | mss425490014 | mss425490014-2 | 7581 | completed | http://tile.loc.gov/image-services/iiif/servic... | March 16, Wednesday,1904 - Dr. Booker Washingt... | Cruger; Calloway; VanRensselaer; Booker; Washi... |
| 2 | Mary Church Terrell: Advocate for African Amer... | Address and appointment books | Mary Church Terrell Papers: Appointment Calend... | mss425490014 | mss425490014-3 | 7582 | completed | http://tile.loc.gov/image-services/iiif/servic... | Fountain Pens Repaired\\r\\nTablets\\r\\nTypewrite... | Pennsylvania; committee; Washington Post |
| 3 | Mary Church Terrell: Advocate for African Amer... | Address and appointment books | Mary Church Terrell Papers: Appointment Calend... | mss425490014 | mss425490014-4 | 7583 | completed | http://tile.loc.gov/image-services/iiif/servic... | May, 1904\\r\\n\\r\\n1 SUNDAY Received invitation ... | NaN |
| 4 | Mary Church Terrell: Advocate for African Amer... | Address and appointment books | Mary Church Terrell Papers: Appointment Calend... | mss425490014 | mss425490014-5 | 7584 | completed | http://tile.loc.gov/image-services/iiif/servic... | June, 1904\\r\\n\\r\\n7 TUESDAY Reached Bremer Hav... | Berlin; Congress morning; June 1904; Paris |
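The previews above come from data frames returned by `load_csv`, which is defined in `02_definitions.ipynb` and is not shown in this section. As a minimal sketch only, assuming the helper is a thin wrapper around `pandas.read_csv`, it might look like the following. The function body and the one-row sample are illustrative assumptions; only the column names are taken from the previews.

```python
import io

import pandas as pd


def load_csv(path_or_buffer):
    """Hypothetical stand-in for the notebook's load_csv helper.

    Assumption: the real helper simply reads a CSV export into a
    pandas data frame. Its actual options are not shown in the source.
    """
    return pd.read_csv(path_or_buffer)


# Made-up one-row sample that reuses the column names from the previews.
sample = io.StringIO(
    "Campaign,Project,Item,ItemId,Asset,AssetId,AssetStatus,"
    "DownloadUrl,Transcription,Tags\n"
    "Example Papers,Example project,Example item,id1,id1-1,1,completed,"
    "http://example.org/1,Some transcribed text,\n"
)

df = load_csv(sample)

# head() returns the first five rows by default, or fewer if the
# frame is shorter, which is how a preview like the ones above is made.
print(df.head())
```

An empty `Tags` field is read as `NaN`, which matches how untagged rows appear in the previews.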
|  | Text | Entity |
|---|---|---|
| 0 | Susan B. Anthony SPEECHES | PERSON |
| 1 | WRITINGS FILE Delivered | ORG |
| 2 | first | ORDINAL |
| 3 | Batavia | GPE |
| 4 | 2 | CARDINAL |
| 5 | May 1852 | DATE |
| 6 | 1852 | DATE |

|  | Text | Entity |
|---|---|---|
| 0 | CATT | PERSON |
| 1 | Carrie Chapman\\r\\nSPEECH | PERSON |
| 2 | ARTICLE, BOOK FILE\\r\\nSpeech | LAW |

|  | Text | Entity |
|---|---|---|
| 0 | Elizabeth Cady Stanton | PERSON |
| 1 | 1814 - 49 | DATE |

|  | Text | Entity |
|---|---|---|
| 0 | Swett | PERSON |
| 1 | Stationery Blank Books | ORG |
| 2 | P Swett | PERSON |
| 3 | February, 1904 | DATE |
| 4 | 178 | CARDINAL |
| 5 | Monday\\r\\n2 Tuesday\\r\\n3 | DATE |
| 6 | Wednesday\\r\\n4 | DATE |
| 7 | Thursday \\r\\n5 Friday\\r\\n6 | DATE |
| 8 | Crandall Association | ORG |
| 9 | 7:30\\r\\nSpecial | TIME |
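The Text/Entity tables above appear to list the named entities found in the first transcription of each dataset. The `entities` column in the processed data frames holds tuples of the form `(text, start, end, label)`. A small sketch of how such tuples could be turned into Text/Entity rows follows. It assumes that `start` and `end` are character offsets into the transcription; `entity_rows` is a hypothetical helper, not a function from the notebook.

```python
def entity_rows(entities):
    """Turn (text, start, end, label) tuples into Text/Entity rows.

    Hypothetical helper for illustration only.
    """
    return [
        {"Text": text, "Entity": label}
        for text, _start, _end, label in entities
    ]


# Visible opening of the first Catt transcription, with one entity
# tuple copied from that row's `entities` column.
transcription = "CATT, Carrie Chapman"
entities = [("CATT", 0, 4, "PERSON")]

rows = entity_rows(entities)
print(rows)

# Under the offset assumption, slicing the transcription with
# start and end recovers the entity text.
text, start, end, label = entities[0]
print(transcription[start:end] == text)
```

Keeping the offsets alongside the label makes it possible to locate each entity in the original transcription later, for example when reviewing doubtful labels such as `LAW` for `ARTICLE, BOOK FILE`.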
|  | Campaign | Project | Item | ItemId | Asset | AssetId | AssetStatus | DownloadUrl | Transcription | Tags | tokenized_text | entities | text | stop_words | nonalphanums | numbers | ambigs | processed_text |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Susan B. Anthony Papers | Speeches and other writings | Susan B. Anthony Papers: Speeches and Writings... | mss11049038 | mss11049038-1 | 179295 | completed | http://tile.loc.gov/image-services/iiif/servic... | Susan B. Anthony SPEECHES AND WRITINGS FI... | May 1852 | [(Susan, Susan, PROPN, NNP, compound, Xxxxx, T... | [(Susan B. Anthony SPEECHES, 0, 30, PERSO... | [(Susan, Susan, PROPN, NNP, compound, Xxxxx, T... | [(AND, and, CCONJ, CC, cc, XXX, True, True), (... | [( , , SPACE, _SP, dep, , False, ... | [(2, 2, NUM, CD, nummod, d, False, False), (18... | [] | [susan, b., anthony, speeches, writing, file, ... |
| 1 | Susan B. Anthony Papers | Speeches and other writings | Susan B. Anthony Papers: Speeches and Writings... | mss11049038 | mss11049038-2 | 179296 | completed | http://tile.loc.gov/image-services/iiif/servic... | /52\\r\\nS.B.A-\\r\\n\\r\\nDelivered for the\\r\\nFirs... | NaN | [(/52, /52, PROPN, NNP, punct, /dd, False, Fal... | [(Batavia, 44, 51, GPE), (N.J., 52, 56, GPE), ... | [(/52, /52, PROPN, NNP, punct, /dd, False, Fal... | [(for, for, ADP, IN, prep, xxx, True, True), (... | [(\\r\\n, \\r\\n, SPACE, _SP, dep, \\r\\n, False, Fa... | [(1852, 1852, NUM, CD, nummod, dddd, False, Fa... | [] | [/52, s.b.a-, deliver, batavia, n.j., company,... |
| 2 | Susan B. Anthony Papers | Speeches and other writings | Susan B. Anthony Papers: Speeches and Writings... | mss11049038 | mss11049038-3 | 179297 | completed | http://tile.loc.gov/image-services/iiif/servic... | will the best & wisest of mothers continue\\r\\n... | temperance | [(will, will, AUX, MD, aux, xxxx, True, True),... | [(the\\r\\nSociety, 295, 307, ORG), (two, 324, 3... | [(best, good, ADJ, JJS, nsubj, xxxx, True, Fal... | [(will, will, AUX, MD, aux, xxxx, True, True),... | [(&, &, CCONJ, CC, cc, &, False, False), (\\r\\n... | [] | [] | [good, wise, mother, continue, son, fall, vict... |
| 3 | Susan B. Anthony Papers | Speeches and other writings | Susan B. Anthony Papers: Speeches and Writings... | mss11049038 | mss11049038-4 | 179298 | completed | http://tile.loc.gov/image-services/iiif/servic... | [Mind] the youthful mind. Of how\\r\\nlittle av... | temperance | [([, [, X, XX, dep, [, False, False), (Mind, m... | [(christian, 77, 86, NORP), (truth & sobernes... | [(Mind, mind, VERB, VB, dep, Xxxx, True, False... | [(the, the, DET, DT, det, xxx, True, True), (O... | [([, [, X, XX, dep, [, False, False), (], ], X... | [] | [] | [mind, youthful, mind, little, avail, untire, ... |
| 4 | Susan B. Anthony Papers | Speeches and other writings | Susan B. Anthony Papers: Speeches and Writings... | mss11049038 | mss11049038-5 | 179299 | completed | http://tile.loc.gov/image-services/iiif/servic... | x\\r\\nWhile we labor to reclaim one generation ... | temperance | [(x, x, ADP, IN, punct, x, True, False), (\\r\\n... | [(one, 29, 32, CARDINAL), (Legislature, 145, 1... | [(x, x, ADP, IN, punct, x, True, False), (labo... | [(While, while, SCONJ, IN, mark, Xxxxx, True, ... | [(\\r\\n, \\r\\n, SPACE, _SP, dep, \\r\\n, False, Fa... | [] | [] | [x, labor, reclaim, generation, drunkard, rise... |

|  | Campaign | Project | Item | ItemId | Asset | AssetId | AssetStatus | DownloadUrl | Transcription | Tags | tokenized_text | entities | text | stop_words | nonalphanums | numbers | ambigs | processed_text |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Carrie Chapman Catt Papers | Speeches and articles | Carrie Chapman Catt Papers: Speech and Article... | mss154040385 | mss154040385-1 | 189284 | completed | http://tile.loc.gov/image-services/iiif/servic... | CATT, Carrie Chapman\\r\\nSPEECH, ARTICLE, BOOK ... | NaN | [(CATT, CATT, PROPN, NNP, ROOT, XXXX, True, Fa... | [(CATT, 0, 4, PERSON), (Carrie Chapman\\r\\nSPEE... | [(CATT, CATT, PROPN, NNP, ROOT, XXXX, True, Fa... | [(An, an, DET, DT, det, Xx, True, True), (For,... | [(,, ,, PUNCT, ,, punct, ,, False, False), (\\r... | [] | [] | [catt, carrie, chapman, speech, article, book,... |
| 1 | Carrie Chapman Catt Papers | Speeches and articles | Carrie Chapman Catt Papers: Speech and Article... | mss154040385 | mss154040385-2 | 189285 | completed | http://tile.loc.gov/image-services/iiif/servic... | -2-\\r\\nWe appeal in the name of our foremother... | NaN | [(-2-, -2-, PUNCT, ``, punct, -d-, False, Fals... | [(-2-\\r\\n, 0, 5, PERSON), (American, 428, 436,... | [(appeal, appeal, VERB, VBP, ccomp, xxxx, True... | [(We, we, PRON, PRP, nsubj, Xx, True, True), (... | [(-2-, -2-, PUNCT, ``, punct, -d-, False, Fals... | [(1,600,000, 1,600,000, NUM, CD, nummod, d,ddd... | [] | [appeal, foremother, forefather, equal, courag... |
| 2 | Carrie Chapman Catt Papers | Speeches and articles | Carrie Chapman Catt Papers: Speech and Article... | mss154040385 | mss154040385-3 | 189286 | completed | http://tile.loc.gov/image-services/iiif/servic... | AN APPEAL FOR LIBERTY. 1915\\r\\n\\r\\nBy Carri... | NaN | [(AN, an, DET, DT, det, XX, True, True), (APPE... | [(1915, 26, 30, DATE), (Carrie Chapman Catt, 3... | [(APPEAL, APPEAL, PROPN, NNP, ROOT, XXXX, True... | [(AN, an, DET, DT, det, XX, True, True), (FOR,... | [(., ., PUNCT, ., punct, ., False, False), ( ... | [(1915, 1915, NUM, CD, ROOT, dddd, False, Fals... | [] | [appeal, liberty, carrie, chapman, catt, year,... |
| 3 | Carrie Chapman Catt Papers | Speeches and articles | Carrie Chapman Catt Papers: Speech and Article... | mss154040386 | mss154040386-1 | 189287 | completed | http://tile.loc.gov/image-services/iiif/servic... | CATT, Carrie Chapman\\r\\nSPEECH, ARTICLE, BOOK ... | NaN | [(CATT, CATT, PROPN, NNP, ROOT, XXXX, True, Fa... | [(CATT, 0, 4, PERSON), (Carrie Chapman\\r\\nSPEE... | [(CATT, CATT, PROPN, NNP, ROOT, XXXX, True, Fa... | [(Be, be, AUX, VB, ROOT, Xx, True, True)] | [(,, ,, PUNCT, ,, punct, ,, False, False), (\\r... | [] | [] | [catt, carrie, chapman, speech, article, book,... |
| 4 | Carrie Chapman Catt Papers | Speeches and articles | Carrie Chapman Catt Papers: Speech and Article... | mss154040386 | mss154040386-2 | 189288 | completed | http://tile.loc.gov/image-services/iiif/servic... | The \\r\\nWoman Citizen\\r\\nA WEEKLY CHRONICLE OF... | NaN | [(The, the, DET, DT, det, Xxx, True, True), (\\... | [(Carrie Chapman Catt, 156, 175, PERSON), (Con... | [(Woman, Woman, PROPN, NNP, compound, Xxxxx, T... | [(The, the, DET, DT, det, Xxx, True, True), (A... | [(\\r\\n, \\r\\n, SPACE, _SP, dep, \\r\\n, False, Fa... | [(21, 21, NUM, CD, nummod, dd, False, False), ... | [] | [woman, citizen, weekly, chronicle, progress, ... |

|  | Campaign | Project | Item | ItemId | Asset | AssetId | AssetStatus | DownloadUrl | Transcription | Tags | tokenized_text | entities | text | stop_words | nonalphanums | numbers | ambigs | processed_text |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Elizabeth Cady Stanton Papers | General correspondence | Elizabeth Cady Stanton Papers: General Corresp... | mss412100001 | mss412100001-1 | 179712 | completed | http://tile.loc.gov/image-services/iiif/servic... | Elizabeth Cady Stanton GENERAL CORRESPONDENCE... | NaN | [(Elizabeth, Elizabeth, PROPN, NNP, compound, ... | [(Elizabeth Cady Stanton, 0, 22, PERSON), (181... | [(Elizabeth, Elizabeth, PROPN, NNP, compound, ... | [] | [( , , SPACE, _SP, dep, , False, False), (-,... | [(1814, 1814, NUM, CD, appos, dddd, False, Fal... | [] | [elizabeth, cady, stanton, general, correspond... |
| 1 | Elizabeth Cady Stanton Papers | General correspondence | Elizabeth Cady Stanton Papers: General Corresp... | mss412100001 | mss412100001-2 | 179713 | completed | http://tile.loc.gov/image-services/iiif/servic... | The following four letters are \\r\\nfrom Daniel... | Peter Smith; Daniel Cady; Judge Cady | [(The, the, DET, DT, det, Xxx, True, True), (f... | [(four, 14, 18, CARDINAL), (Daniel Cady, 38, 4... | [(following, follow, VERB, VBG, amod, xxxx, Tr... | [(The, the, DET, DT, det, Xxx, True, True), (f... | [(\\r\\n, \\r\\n, SPACE, _SP, dep, \\r\\n, False, Fa... | [] | [] | [follow, letter, daniel, cady, peter, smith, j... |
| 2 | Elizabeth Cady Stanton Papers | General correspondence | Elizabeth Cady Stanton Papers: General Corresp... | mss412100001 | mss412100001-3 | 179714 | completed | http://tile.loc.gov/image-services/iiif/servic... | 22 ... | NaN | [(22, 22, NUM, CD, ROOT, dd, False, False), ( ... | [(22, 0, 2, CARDINAL), (2 Dec. 1814, 91, 102, ... | [(Dec., Dec., PROPN, NNP, npadvmod, Xxx., Fals... | [(It, it, PRON, PRP, nsubj, Xx, True, True), (... | [( ... | [(22, 22, NUM, CD, ROOT, dd, False, False), (2... | [] | [dec., dear, sir, true, lose, young, child, th... |
| 3 | Elizabeth Cady Stanton Papers | General correspondence | Elizabeth Cady Stanton Papers: General Corresp... | mss412100001 | mss412100001-4 | 179715 | completed | http://tile.loc.gov/image-services/iiif/servic... | he could to make her respectable & happy. That... | Peter Smith; Bonaparte | [(he, he, PRON, PRP, nsubj, xx, True, True), (... | [(one, 467, 470, CARDINAL), (one, 614, 617, CA... | [(respectable, respectable, ADJ, JJ, ccomp, xx... | [(he, he, PRON, PRP, nsubj, xx, True, True), (... | [(&, &, CCONJ, CC, cc, &, False, False), (., .... | [(2d, 2d, NUM, CD, nummod, dx, False, False), ... | [] | [respectable, happy, moment, flatter, soon, se... |
| 4 | Elizabeth Cady Stanton Papers | General correspondence | Elizabeth Cady Stanton Papers: General Corresp... | mss412100001 | mss412100001-5 | 179716 | completed | http://tile.loc.gov/image-services/iiif/servic... | Johnstown 2 D Paid 10\\r\\n\\r\\n\\r\\nPeter Smi... | Peter Smith | [(Johnstown, Johnstown, PROPN, NNP, nmod, Xxxx... | [(Johnstown, 0, 9, GPE), (10, 23, 25, CARDINAL... | [(Johnstown, Johnstown, PROPN, NNP, nmod, Xxxx... | [] | [( , , SPACE, _SP, dep, , False, Fa... | [(2, 2, NUM, CD, nummod, d, False, False), (10... | [] | [johnstown, d, paid, peter, smith, esquire, pe... |

|  | Campaign | Project | Item | ItemId | Asset | AssetId | AssetStatus | DownloadUrl | Transcription | Tags | tokenized_text | entities | text | stop_words | nonalphanums | numbers | ambigs | processed_text |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Mary Church Terrell: Advocate for African Amer... | Address and appointment books | Mary Church Terrell Papers: Appointment Calend... | mss425490014 | mss425490014-1 | 7580 | completed | http://tile.loc.gov/image-services/iiif/servic... | Office Supplies typewriter ribbons fountain pe... | Mrs Ella Wheeler Wilcox; Woman Suffrage Conven... | [(Office, office, NOUN, NN, compound, Xxxxx, T... | [(Swett, 101, 106, PERSON), (Stationery Blank ... | [(Office, office, NOUN, NN, compound, Xxxxx, T... | [(’s, ’s, PART, POS, case, ’x, False, True), (... | [(\\r\\n, \\r\\n, SPACE, _SP, dep, \\r\\n, False, Fa... | [(603, 603, NUM, CD, nummod, ddd, False, False... | [] | [office, supply, typewriter, ribbon, fountain,... |
| 1 | Mary Church Terrell: Advocate for African Amer... | Address and appointment books | Mary Church Terrell Papers: Appointment Calend... | mss425490014 | mss425490014-2 | 7581 | completed | http://tile.loc.gov/image-services/iiif/servic... | March 16, Wednesday,1904 - Dr. Booker Washingt... | Cruger; Calloway; VanRensselaer; Booker; Washi... | [(March, March, PROPN, NNP, npadvmod, Xxxxx, T... | [(March 16, 0, 8, DATE), (Booker, 31, 37, PERS... | [(March, March, PROPN, NNP, npadvmod, Xxxxx, T... | [(as, as, ADP, IN, prep, xx, True, True), (our... | [(,, ,, PUNCT, ,, punct, ,, False, False), (-,... | [(16, 16, NUM, CD, nummod, dd, False, False), ... | [] | [march, wednesday,1904, dr., booker, washingto... |
| 2 | Mary Church Terrell: Advocate for African Amer... | Address and appointment books | Mary Church Terrell Papers: Appointment Calend... | mss425490014 | mss425490014-3 | 7582 | completed | http://tile.loc.gov/image-services/iiif/servic... | Fountain Pens Repaired\\r\\nTablets\\r\\nTypewrite... | Pennsylvania; committee; Washington Post | [(Fountain, Fountain, PROPN, NNP, compound, Xx... | [(Fountain Pens Repaired\\r\\nTablets\\r\\nTypewri... | [(Fountain, Fountain, PROPN, NNP, compound, Xx... | [('s, 's, PART, POS, case, 'x, False, True), (... | [(\\r\\n, \\r\\n, SPACE, _SP, dep, \\r\\n, False, Fa... | [(603, 603, NUM, CD, nummod, ddd, False, False... | [(?, ?, ADJ, JJ, punct, ?, False, False), (Wi?... | [fountain, pens, repaired, tablet, typewriter,... |
| 3 | Mary Church Terrell: Advocate for African Amer... | Address and appointment books | Mary Church Terrell Papers: Appointment Calend... | mss425490014 | mss425490014-4 | 7583 | completed | http://tile.loc.gov/image-services/iiif/servic... | May, 1904\\r\\n\\r\\n1 SUNDAY Received invitation ... | NaN | [(May, May, PROPN, NNP, nmod, Xxx, True, True)... | [(May, 1904, 0, 9, DATE), (1, 13, 14, CARDINAL... | [(SUNDAY, SUNDAY, PROPN, NNP, appos, XXXX, Tru... | [(May, May, PROPN, NNP, nmod, Xxx, True, True)... | [(,, ,, PUNCT, ,, punct, ,, False, False), (\\r... | [(1904, 1904, NUM, CD, nummod, dddd, False, Fa... | [] | [sunday, receive, invitation, fran, olga, mr, ... |
| 4 | Mary Church Terrell: Advocate for African Amer... | Address and appointment books | Mary Church Terrell Papers: Appointment Calend... | mss425490014 | mss425490014-5 | 7584 | completed | http://tile.loc.gov/image-services/iiif/servic... | June, 1904\\r\\n\\r\\n7 TUESDAY Reached Bremer Hav... | Berlin; Congress morning; June 1904; Paris | [(June, June, PROPN, NNP, npadvmod, Xxxx, True... | [(June, 1904, 0, 10, DATE), (7, 14, 15, CARDIN... | [(June, June, PROPN, NNP, npadvmod, Xxxx, True... | [(in, in, ADP, IN, prep, xx, True, True), (at,... | [(,, ,, PUNCT, ,, punct, ,, False, False), (\\r... | [(1904, 1904, NUM, CD, nummod, dddd, False, Fa... | [] | [june, tuesday, reach, bremer, haven, morning,... |
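Each token in the processed data frames is an 8-tuple. Its fields look like a token's text, lemma, coarse part of speech, fine-grained tag, dependency label, shape, an alphabetic flag, and a stop-word flag, although the source does not name them. The sketch below shows one way tokens could be sorted into the `text`, `stop_words`, `nonalphanums`, `numbers`, and `ambigs` columns, with `processed_text` holding lowercased lemmas of the kept tokens. The field names and every bucketing rule here are assumptions inferred from the visible rows, not the notebook's actual logic.

```python
from typing import NamedTuple


class Token(NamedTuple):
    # Field names are assumed from the tuple layout in the tables.
    text: str
    lemma: str
    pos: str
    tag: str
    dep: str
    shape: str
    is_alpha: bool
    is_stop: bool


def partition(tokens):
    """Sort tokens into buckets using assumed, illustrative rules."""
    buckets = {
        "text": [],
        "stop_words": [],
        "nonalphanums": [],
        "numbers": [],
        "ambigs": [],
    }
    for tok in tokens:
        if "?" in tok.text:
            # Assumption: "?" marks an uncertain transcription.
            buckets["ambigs"].append(tok)
        elif tok.is_stop:
            buckets["stop_words"].append(tok)
        elif tok.pos == "NUM":
            buckets["numbers"].append(tok)
        elif tok.pos in {"PUNCT", "SPACE"} or not any(
            ch.isalnum() for ch in tok.text
        ):
            buckets["nonalphanums"].append(tok)
        else:
            buckets["text"].append(tok)
    # processed_text: lowercased lemmas of the tokens that were kept.
    buckets["processed_text"] = [t.lemma.lower() for t in buckets["text"]]
    return buckets


# Complete tuples copied from several rows of the tables above.
tokens = [
    Token("will", "will", "AUX", "MD", "aux", "xxxx", True, True),
    Token("best", "good", "ADJ", "JJS", "nsubj", "xxxx", True, False),
    Token("&", "&", "CCONJ", "CC", "cc", "&", False, False),
    Token("2", "2", "NUM", "CD", "nummod", "d", False, False),
    Token("?", "?", "ADJ", "JJ", "punct", "?", False, False),
]

result = partition(tokens)
print(result["processed_text"])
```

With these rules, `will` lands in `stop_words`, `&` in `nonalphanums`, `2` in `numbers`, `?` in `ambigs`, and `best` is kept and lemmatized to `good`, which agrees with the first entries visible in the third Anthony row.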