Obama’s Speeches


T-LAB Tools for Text Analysis



Obama's Speeches

Short Sample by Franco Lancia (franco.lancia@tlab.it)
March 14, 2007

On February 10, 2007, I watched on TV the speech in which Senator Barack Obama announced, in Springfield, his candidacy for President of the United States.

Because I was impressed by his ability to communicate his emotions and ideas, I decided to spend some time trying to better understand his political vision.

On March 6, I downloaded all 77 of B. Obama's speeches available from the website http://obama.senate.gov/speech/ and imported them into T-LAB.

Then, in order to make a method of text analysis public, I decided to keep a record of my steps and to publish a short report of my exploratory route.

Some brief remarks on the preliminary treatments:
- automatic lemmatization was applied (e.g. the headword "hope" stands for "hope", "hopes", "hoping", "hoped");
- the most important multiword expressions were detected and transformed (e.g. "Civil Rights Movement" -> "Civil_Rights_Movement");
- the elementary contexts for co-occurrence analysis were set as text fragments of comparable length, each including one or more sentences;
- a wordlist of 1446 items (i.e. lemmas) with occurrence values greater than or equal to 8 was used.
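The preliminary treatments listed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not T-LAB's implementation: the lookup tables `MULTIWORDS` and `LEMMAS` and the function names are hypothetical, and a real lemmatizer relies on far larger dictionaries.

```python
from collections import Counter
import re

# Hypothetical lookup tables for illustration; real dictionaries are far larger.
MULTIWORDS = {"civil rights movement": "Civil_Rights_Movement"}
LEMMAS = {"hopes": "hope", "hoping": "hope", "hoped": "hope"}


def preprocess(text):
    """Return the lemmas of one text, with multiword expressions joined by underscores."""
    text = text.lower()
    for phrase, token in MULTIWORDS.items():
        text = text.replace(phrase, token)
    tokens = re.findall(r"[A-Za-z_']+", text)
    return [LEMMAS.get(tok, tok) for tok in tokens]


def build_wordlist(texts, min_occurrences=8):
    """Keep only the lemmas whose corpus frequency reaches the threshold."""
    counts = Counter(lemma for text in texts for lemma in preprocess(text))
    return {lemma: n for lemma, n in counts.items() if n >= min_occurrences}
```

With the threshold left at 8, `build_wordlist` mirrors the cut-off used for the 1446-item wordlist; on a toy corpus a lower value such as `min_occurrences=2` is more practical.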

First, to get an initial representation of the data structure, I performed a Correspondence Analysis (*) of a word x speech contingency table, that is, a matrix of 1446 rows and 77 columns whose cells contain frequency values.

(*) This T-LAB tool is available in the sub-menu "Comparative Analysis".
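For readers who want to see what a correspondence analysis computes, here is a minimal sketch based on the standard textbook formulation (singular value decomposition of the standardized residuals). It assumes NumPy is available and that the table has no empty row or column; the function name is hypothetical and the code makes no claim about how T-LAB implements the tool.

```python
import numpy as np


def correspondence_analysis(table, n_axes=2):
    """Return row coordinates, column coordinates and the inertia of each axis."""
    N = np.asarray(table, dtype=float)
    P = N / N.sum()                      # correspondence matrix (relative frequencies)
    r = P.sum(axis=1)                    # row masses (words)
    c = P.sum(axis=0)                    # column masses (speeches)
    # Standardized residuals with respect to the independence model r * c.
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))
    U, sing, Vt = np.linalg.svd(S, full_matrices=False)
    # Principal coordinates on the first n_axes factorial axes.
    rows = (U[:, :n_axes] * sing[:n_axes]) / np.sqrt(r)[:, None]
    cols = (Vt.T[:, :n_axes] * sing[:n_axes]) / np.sqrt(c)[:, None]
    return rows, cols, sing[:n_axes] ** 2


# Toy word x speech table (4 words, 3 speeches), invented for illustration.
toy = [[20, 2, 1],
       [15, 3, 2],
       [1, 18, 4],
       [2, 4, 16]]
rows, cols, inertia = correspondence_analysis(toy)
```

Plotting the two columns of `rows` and `cols` against each other gives maps of the same kind as those on the first two factorial axes; the sum of all axis inertias equals the chi-square statistic of the table divided by its grand total.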

The following charts show the obtained results mapped on the first two factorial axes.

Figure 1 - The most significant keywords

Figure 2 - The 77 speeches

In summary, the main topics of the analysed speeches turn out to be organized in the following way:

Figure 3 - A 4-way scheme

In detail, a precise description of these main topics is provided by the following T-LAB tables, which include the first 30 words for each pole, sorted by decreasing test value.

Figure 4 - Word Test Values

Some comments:

The first (horizontal) axis opposes "internal" and "external" political topics (see "education" vs "war"), whereas the second (vertical) axis, through its reference to climate change, seems to create a bridge between the first two polarities.

These simplified results depend on the method of analysis in two respects:
a) the choice of analysis units, that is the construction of a table having as many columns as there are different speeches;
b) the choice of a specific statistical tool, that is the correspondence analysis.

From a methodological point of view, to analyse the same table (to be precise, its transpose) I could have used a typical document clustering algorithm (see the T-LAB tool Thematic Document Classification), but, as I was able to verify, the results would be very similar. Starting from 77 speeches/documents, the reduction process yields only 4 clusters, which is not enough to produce an articulate thematic map.

By contrast, by changing the analysis units, that is, by asking T-LAB to build and analyse a table with as many rows as there are elementary contexts (in this case, more than three thousand), I obtained a map of 12 thematic clusters.

The logic of this T-LAB tool (Thematic Analysis of Elementary Contexts) is explained in the user's manual and in the on-line help, both available from the T-LAB web site (www.tlab.it). Briefly, it combines two kinds of analysis: the first uses a clustering algorithm to discover thematic groups of elementary contexts sharing similar word co-occurrence patterns; the second builds a words x clusters contingency table and maps its structure by means of correspondence analysis.
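The two steps just described can be illustrated with a small pure-Python sketch. It is an assumption-laden stand-in, not the algorithm T-LAB uses: a plain k-means-style loop with cosine similarity replaces whatever clustering the tool applies, the initial centroids are simply the contexts whose indices are passed in `seeds`, and all function names are hypothetical.

```python
from collections import Counter
import math


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b.get(w, 0) for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def cluster_contexts(contexts, seeds, n_iter=10):
    """Step 1: assign each context (a list of lemmas) to its most similar centroid."""
    vectors = [Counter(ctx) for ctx in contexts]
    centroids = [Counter(vectors[i]) for i in seeds]
    labels = []
    for _ in range(n_iter):
        labels = [max(range(len(centroids)), key=lambda k: cosine(v, centroids[k]))
                  for v in vectors]
        for k in range(len(centroids)):
            members = [v for v, lab in zip(vectors, labels) if lab == k]
            if members:
                # The summed counts serve as centroid; cosine ignores the scale.
                centroids[k] = sum(members, Counter())
    return labels


def words_by_clusters(contexts, labels):
    """Step 2: contingency table with table[word][cluster] = occurrences."""
    table = {}
    for ctx, lab in zip(contexts, labels):
        for word in ctx:
            table.setdefault(word, Counter())[lab] += 1
    return table


# Toy elementary contexts, invented for illustration.
contexts = [["war", "iraq", "troop"],
            ["school", "teacher", "child"],
            ["iraq", "troop", "redeploy"],
            ["school", "child", "parent"]]
labels = cluster_contexts(contexts, seeds=[0, 1])
table = words_by_clusters(contexts, labels)
```

The resulting words x clusters table is the kind of matrix that a correspondence analysis can then map onto factorial axes.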

The following charts show the relationships of the twelve clusters within the first bi-dimensional space (Figure 5) and their relative weight (Figure 6).


Figure 5 - Scatter chart with 12 thematic clusters


Figure 6 - Histogram of 12 thematic clusters

In some ways, the structures of the bi-dimensional spaces obtained by means of the two different tools are very similar (see Fig. 1 and Fig. 5). But in the latter case (Fig. 5), each label stands for a cluster of elementary contexts sharing similar word patterns, and T-LAB makes tables with the characteristics of each cluster available, so we can examine each of them in detail.

For example, the following three tables report the characteristics of three thematic clusters:

Figure 7 - Thematic cluster HEALTH-CARE

Figure 8 - Thematic cluster IRAQ WAR

Figure 9 - Thematic cluster OIL & ENERGY

We can also summarize the "content" of the thematic clusters by using a list (see below) which includes the two most significant elementary contexts of each of them, i.e. the elementary contexts to which, within each cluster, T-LAB assigned the highest scores.
In some ways, the following list reads like a summary of a political manifesto.


When a parent takes parental leave, we shouldn't act like caring for a newborn baby is a three-month break - we should let them keep their salary. When parents are working and their children need care, we should make sure that care is affordable, and we should make sure our kids can go to school earlier and longer so they have a safe place to learn while their parents are at work
The amendment is simple: it says that the children of low-income working parents affected by Hurricane Katrina will no longer be denied the child credit. You work, your kids get a benefit. If you don't work, no benefit. And if you want the full benefit, you have to earn at least $10,000, which is just about the income of a full time job at minimum wage.


Drawing down our troops in Iraq will allow us to redeploy additional troops to Northern Iraq and elsewhere in the region as an over-the-horizon force. This force could help prevent the conflict in Iraq from becoming a wider war, consolidate gains in Northern Iraq, reassure allies in the Gulf, allow our troops to strike directly at al Qaeda wherever it may exist ...
this redeployment remains our best leverage to pressure the Iraqi government to achieve the political settlement between its warring factions that can slow the bloodshed and promote stability. My plan also allows for a limited number of U.S. troops to remain and prevent Iraq from becoming a haven for international terrorism and reduce the risk of all-out chaos.


Our economic dominance has depended on individual initiative and belief in the free market; but it has also depended on our sense of mutual regard for each other, the idea that everybody has a stake in the country, that we're all in it together and everybody's got a shot at opportunity. And so if we're serious about this opportunity
Yes, our greatness as a nation has depended on individual initiative, on a belief in the free market. But it has also depended on our sense of mutual regard for each other, the idea that everybody has a stake in the country, that we're all in it together and everybody's got a shot at opportunity. Robert Kennedy reminded us of this. He reminds us still.


In the Lochner case, and in a whole series of cases prior to Lochner being overturned, the Supreme Court consistently overturned basic measures like minimum wage laws, child labor safety laws, and rights to organize, deeming those laws as somehow violating a constitutional right to private property.
Let me just give you a couple examples. In a case reviewing California's parental notification law, Justice Brown criticized the California Supreme Court decision overturning that law, saying that the court should have remained tentative, recognizing the primacy of legislative prerogatives.


Schools that raise student achievement would be given bonuses. For schools that don't improve, the districts would close them and replace them with new, smaller schools that can replicate some of the successful reforms taking place elsewhere.
To hold schools and teachers accountable for the results of all these reforms, Innovation Districts would be asked to support schools that succeed and shut down those that don't. To find out what works and what doesn't, we'd provide them with powerful data and technology, and also give them the option of partnering with local universities to help them improve performance..


I imagine that they would've seen the marchers and heard the speeches, but they also probably saw the dogs and the fire hoses, or the footage of innocent people being beaten within an inch of their lives; or heard the news the day those four little girls died when someone threw a bomb into their church.
And in that movement, she saw women who were willing to walk instead of ride the bus after a day of doing somebody else's laundry and looking after somebody else's children because they walked for freedom. And she saw young people of every race and every creed take a bus down to Mississippi and Alabama to register voters because they believed.


I thank the managers of this bill, Senators McConnell and Leahy, and their staffs for working with me on this important issue. I know that Senator McConnell has a longstanding interest in Southeast Asia, and Senator Leahy has always been a champion of international health issues, making the avian flu something I know they both care deeply about.
So last November, we introduced an amendment to the tax reconciliation bill expressing the Sense of the Senate that FEMA should immediately rebid these contracts. Our colleagues agreed and passed this amendment by unanimous consent. After our amendment passed, both Senator Coburn and I met with Director Paulison, and again he assured us that these contracts would be rebid.


Even people who didn't know me were skeptical of my decision. I remember having a conversation with an older man I had met before I arrived in Chicago. I told him about my plans, and he looked at me and said, "Let me tell something. You look like a nice clean-cut young man, and you've got a nice voice.
And yet, somehow, we're still hearing stories like the one I heard from a veteran named Bill Allen, who told me that on a trip to Chicago, he actually saw homeless veterans fighting over access to the dumpsters. That's what I thought about. And finally, I thought about a young man named Seamus Ahern, who I met during the campaign at a V.F.W.


Each and every one of these challenges call for an America that is more purposeful, more grown-up than the America that we have today. An America that reflects the lessons that have helped so many of its people mature in their own lives. An America that's about not just each of us, but all of us. An America that takes great risks in the face of greater odds.
That's as true today as it was then - the real job of organizing working America politics and policy, vision and mission, heart and soul - belongs to each of you. And if you have the courage to succeed, labor will rise again. America will rise again. And hope will rise again. Thank you and God Bless you.


More and more, Americans are competing for these jobs with highly educated workers from India, China, and all over the world. If we want America to win in this new global economy, we have to start sending more kids to college, not less.
instant messaging with friends across the world - a quiet revolution has been breaking down barriers and connecting the world's economies. Now, businesses not only have the ability to move jobs wherever there's a factory, but wherever there's an internet connection.


But by bringing our health care system on-line, we could start improving the quality of care and cutting the cost of it. We could save thousands of lives and save families billions of dollars. Just imagine if every doctor and nurse could sit by a patient's bedside with a laptop and pull up their entire medical history - information from every past doctor they've seen - ..
From the smallest mom and pop stores to major corporations like GM, businesses who can't afford these rising costs are cutting back on insurance, workers, or both. States with bigger Medicaid bills and smaller budgets are being forced to choose whether they want their citizens to be unhealthy or uneducated. And over half of all family bankruptcies today are


Recently, I joined a few other Senators in introducing a bill that would increase America's renewable fuel standard and increase ethanol production along with it. A bill like this that's already passed the Senate twice would've provided us with 500,000 barrels a day of refined ethanol for use in gasoline and would save us $4 billion every year in imported oil and gasoline costs
The President's energy proposal would reduce our oil imports by 4.5 million barrels per day by 2025. Not only can we do better than that, we must do better than that if we hope to make a real dent in our oil dependency. With technology we have on the shelves right now and fuels we can grow right here in America, by 2025 we can reduce our oil imports by over 7.5.



At this point, the main topics of Obama's speeches are duly mapped, both those concerning the development of human and technological resources and those concerning security and the defence of people's rights. Looking at Figures 5 and 6, we can notice that the majority of the topics, even if expressed in a universal language, concern domestic policy and are addressed to ordinary people.

Because the thematic partition into twelve clusters can be saved, by using other T-LAB tools we can investigate further relationships between and within them.

A first kind of relationship, concerning the discourse transitions from one theme (i.e. thematic cluster) to the others, can be explored by using a tool which performs a Markovian analysis of the Sequences of Themes.

Some of its typical outputs are as follows:

Figure 10 - Adjacency Matrix of Thematic Clusters

Figure 11 - Predecessors and Successors of CHALLENGE theme

N.B. In this table the "PROB" values indicate the probability of each theme coming before (predecessor) or after (successor) the selected item within the discourse sequence.
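The predecessor and successor probabilities can be illustrated with a first-order transition count over the sequence of themes. The sketch below is a simplification under stated assumptions: it treats the corpus as one flat sequence of cluster labels (ignoring speech boundaries), the function names are hypothetical, and the toy sequence is invented rather than taken from the T-LAB output.

```python
from collections import Counter, defaultdict


def transition_counts(theme_sequence):
    """Count how often theme a is immediately followed by theme b."""
    counts = defaultdict(Counter)
    for a, b in zip(theme_sequence, theme_sequence[1:]):
        counts[a][b] += 1
    return counts


def successors(counts, theme):
    """Estimate P(next theme = b | current theme = theme)."""
    total = sum(counts[theme].values())
    return {b: n / total for b, n in counts[theme].items()} if total else {}


def predecessors(counts, theme):
    """Estimate P(previous theme = a | current theme = theme)."""
    incoming = {a: row[theme] for a, row in counts.items() if row[theme]}
    total = sum(incoming.values())
    return {a: n / total for a, n in incoming.items()}


# Toy sequence of thematic clusters, one label per elementary context.
sequence = ["CHALLENGE", "OUR COUNTRY", "CHALLENGE",
            "IRAQ WAR", "CHALLENGE", "OUR COUNTRY"]
counts = transition_counts(sequence)
```

Here `counts` plays the role of an adjacency matrix, while `successors(counts, "CHALLENGE")` and `predecessors(counts, "CHALLENGE")` give the two columns of a predecessors/successors table for one theme.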

By manipulating these kinds of outputs and using other software, it is possible to produce graphs like the following:

Figure 12 - Network of the main interconnections between all thematic clusters

Figure 13 - Interconnection between CHALLENGE and the other thematic clusters.

A second kind of relationship, concerning the word co-occurrences within each cluster, can be explored by using the Word Associations tool.

For example, we can compare the different contextual meanings of Obama's keyword "hope" within the entire corpus and within some thematic clusters, also by extracting some sentences with significant word co-occurrences.

N.B. The enclosed tables report the association measures, i.e. the cosine coefficients.
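As a rough illustration of such an association measure, the sketch below computes a cosine coefficient from presence/absence data: the number of elementary contexts containing both words, divided by the square root of the product of the numbers of contexts containing each word. It assumes this common co-occurrence form of the coefficient; the function name and the toy contexts are hypothetical.

```python
import math


def cosine_associations(contexts, target):
    """Cosine coefficient between `target` and each word co-occurring with it."""
    sets = [set(ctx) for ctx in contexts]
    n_target = sum(target in s for s in sets)
    scores = {}
    for word in set().union(*sets) - {target}:
        n_word = sum(word in s for s in sets)
        n_both = sum(word in s and target in s for s in sets)
        if n_both:
            scores[word] = n_both / math.sqrt(n_target * n_word)
    # Strongest associations first.
    return dict(sorted(scores.items(), key=lambda kv: -kv[1]))


# Toy elementary contexts, invented for illustration.
contexts = [["hope", "future", "america"],
            ["hope", "future"],
            ["hope", "war"],
            ["war", "iraq"]]
scores = cosine_associations(contexts, "hope")
```

Restricting `contexts` to the elementary contexts of a single thematic cluster gives the within-cluster associations; passing all of them gives the associations within the whole corpus.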

Here are the word associations of Hope within the whole corpus:

Here are the word associations of Hope within the CHALLENGE thematic cluster:

Here are the word associations of Hope within the OIL & ENERGY thematic cluster:


Here are the word associations of Hope within the IRAQ WAR thematic cluster:

Here are the word associations of Hope within the OUR COUNTRY thematic cluster:


Other exploratory routes would be possible, using either other T-LAB tools or other software; but, because I'm not a researcher in political science, I leave the job of a more accurate analysis and of interpreting the data to more competent people.
However, on the basis of what I understood, as a world citizen I "hope" that the American people will have the courage to build a bridge towards the future "dreamed" by Senator B. Obama.