Tuesday, December 16, 2014

Fundamental plot arcs, seen through multidimensional analysis of thousands of TV and movie scripts

It's interesting to look, as I did in my last post, at the plot structure of typical episodes of a TV show as derived through topic models. But while it may help in understanding individual TV shows, the method also shows some promise on a more ambitious goal: understanding the general structural elements that most TV shows and movies draw from. TV and movie scripts are carefully crafted structures: I wrote earlier about how the Simpsons moves away from the school after its first few minutes, for example, and with this larger corpus even individual words frequently show a strong bias towards the front or end of scripts. This crafting shows up in the ways language is distributed through scripts in time.

So that's what I'm going to do here: make some general observations about the ways that scripts shift thematically. On its own, this stuff is pretty interesting--when I first started analyzing the set, I thought it might be an end in itself. But it turns out that by combining those thematic shifts with the topic models, it's possible to do something I find really fascinating, and a little mysterious: you can sketch out, derived from the tens of thousands of hours of dialogue in the corpus, what you could literally call a plot "arc" through multidimensional space.


Words in screen time

First, let's lay the groundwork. Many, many individual words show strong trends towards the beginning or end of scripts. In fact, plotting a word in what I'm calling "screen time" usually gives a much more recognizable signature than plotting it in the "historic time" you can explore yourself in the movie bookworm. So what I've done is cut every script in the corpus into "twelfths"; the charts here show the course of an episode or movie from the first minute at the left to the last one at the right. For example: the phrase "love you" (as in, mostly, "I love you") is most frequent towards the end of movies or TV shows: characters are almost three times more likely to profess their love in the last scene of a movie than in the first.
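If you want to play with this kind of chart yourself, the bucketing is simple. Here's a minimal sketch, assuming each script has already been parsed into (timestamp, dialogue line) pairs--that data structure is my assumption, not the actual Bookworm ingest code:

```python
from collections import Counter

def twelfth_shares(script, phrase, n_bins=12):
    """Fraction of dialogue lines in each twelfth that contain `phrase`.

    `script` is a hypothetical list of (seconds, line) pairs covering
    the whole runtime.
    """
    runtime = max(t for t, _ in script) or 1
    hits, totals = Counter(), Counter()
    for t, line in script:
        # min() keeps the final timestamp from landing in a 13th bin
        b = min(int(n_bins * t / runtime), n_bins - 1)
        totals[b] += 1
        hits[b] += phrase in line.lower()
    return [hits[b] / totals[b] if totals[b] else 0.0 for b in range(n_bins)]
```

Run over every movie with the phrase "love you", the last element of that list should come out roughly three times the first.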

Thursday, December 11, 2014

Typical TV episodes: visualizing topics in screen time

The most interesting element of the Bookworm browser for movies I wrote about in my last post here is the possibility of delving into the episodic structure of different TV shows by dividing them up by minutes. On my website, I previously wrote about story structures in the Simpsons and about a topic model of movies I made using the general-purpose Bookworm topic modeling extension. For a description of the corpus or of topic modeling, see those links.

Note: Part II of this series, which goes into quantifying the fundamental shared elements of plot arcs, is now up here.

In this post, I'm going to combine those two projects. What can we see by looking at how the content of TV shows is distributed across their running time? Are there elements to the way TV shows are laid out--common plot structures--that repeat? How thematically different is the end of a show from its beginning? I want to take a first stab at those questions by looking at a couple hundred TV shows and their structure. To do that, I did the following (a rough sketch of the plotting step follows the list):

1. Divided a corpus of 80,000 movies and TV show episodes into 3-minute chunks, and then divided each show into 12 roughly equal parts.
2. Generated a 128-topic model where each document is one of those 3-minute chunks, which should help the topics be better geared to what's on screen at any given time.
3. For every TV show, plotted the distribution of the ten most common topics, with the y-axis roughly representing the percentage of the show's dialogue in the topic, and the x-axis corresponding to the twelfth of the show it happened in. So dialogue in minute 55 of a 60-minute show will be in chunk 11 (counting from zero).
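Step 3 is just a stacked-area plot over that grid. Here's a sketch of roughly what it looks like, assuming the model output has been flattened into a pandas DataFrame; the column names are my invention, not the actual pipeline's:

```python
import pandas as pd
import matplotlib.pyplot as plt

def plot_show(chunks: pd.DataFrame, show: str, n_topics: int = 10):
    """Stacked-area chart of a show's top topics across its twelfths.

    `chunks` has one row per (3-minute chunk, topic): columns 'show',
    'twelfth' (1-12), 'topic', and 'weight' (the share of the chunk's
    words the model assigns to that topic).
    """
    d = chunks[chunks["show"] == show]
    top = d.groupby("topic")["weight"].sum().nlargest(n_topics).index
    grid = (d[d["topic"].isin(top)]
            .pivot_table(index="twelfth", columns="topic",
                         values="weight", aggfunc="mean", fill_value=0))
    grid.plot.area(title=show, xlabel="twelfth of episode",
                   ylabel="share of dialogue", figsize=(10, 5))
    plt.show()
```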

First a note: these images seem not to display in some browsers. If you want to zoom and can't read the legends, right click and select "view in a new tab."

Let's start by looking at a particularly formulaic show: Law and Order.





The two most common topics in Law & Order are "court case Mr. trial lawyer" and "murder body blood case". Murder is strongest in the first twelfth, when the body is discovered; "court case" doesn't appear in any strength until almost halfway through, after which it grows until it takes up more than half the space by the last twelfth.

That's pretty good straight off: the process accurately captures the central structuring element of the show, which is the handoff from cops to lawyers at the 30-minute mark. (Or really, this suggests, more like the 25-minute mark.) Most of the other topics are relatively constant. (It's interesting that the gun topic is constant, actually, but that's another matter.) But a few change--we also get a decrease in the topic "people kid kids talk," capturing some element of the interview process by the cops; a different conversation topic, "talk help take problem," is more associated with the lawyers. Also, the total curve is wider at the end than at the beginning; that's because we're not looking at all the words in Law & Order, just the top ten out of 127 topics. We could infer, preliminarily, that Law & Order is more thematically coherent in the last half hour than the first: there's a lot of thematic diversity as the detectives roam around New York, but the courtroom half is always the same.

Compare the spinoffs: SVU is almost identical to the Law & Order mothership, but Criminal Intent gets to the courtroom much later and with less intensity.






See below the fold for more. Be warned: I've put a whole bunch of images into this one.

Monday, September 15, 2014

Screen time!

Here's a very fun, and for some purposes, perhaps, a very useful thing: a Bookworm browser that lets you investigate onscreen language in about 87,000 movies and TV shows, together encompassing over 600 million words. (Go follow that link if you want to investigate yourself.)

I've been thinking about doing this for years, but some of the interest in my recent Simpsons browser and some leaps and bounds in the Bookworm platform have spurred me to finally lay it out. This comes from a very large collection of closed captions/subtitles from the website opensubtitles.org; thanks very much to them for providing a bulk download.

Just as a set of line charts, this provides a nice window into changing language. I've been interested in the "need to"/"ought to" shift since I wrote about it in Mad Men: it's quite clear in the subtitle corpus, and the ratio is much higher as of 2014 than anything Ngrams can show.


Thursday, September 11, 2014

Some links to myself

An FYI, mostly for people following this feed on RSS: I just put up on my home web site a post about applications for the Simpsons Bookworm browser I made. It touches on a bunch of stuff that would usually lead me to post it here. (Really, it hits the Sapping Attention trifecta: a discussion of the best ways of visualizing Dunning log-likelihood; cryptic allusions to critical theory; and overly serious discussions of popular TV shows.) But it's even less proofread and edited than what I usually put here, and I've lately been more and more reluctant to post things on a Google site like this, particularly as Blogger gets folded more and more into Google Plus. That's one of the big reasons I don't post here as much as I used to, honestly. (Another is that I don't want to worry about embedded javascript.) So, head over there if you want to read it.

While I'm at it: I made a few data visualizations last year that I only shared on Twitter, but meant to link to from here; those are now linked from a single place on my web site. My favorite is the baseball leaderboard; the most popular was either the distorted subway maps or the career charts; and the most useful, I think, is the browser of college degrees by school and institution type. There are a couple of others as well. (And there are a few not there that I'll add at some point.)

Wednesday, August 13, 2014

Data visualization rules, 1915

Right now people in data visualization tend to be interested in their field’s history, and people in digital humanities tend to be fascinated by data visualization. Doing some research in the National Archives in Washington this summer, I came across an early set of rules for graphic presentation by the Bureau of the Census from February 1915. Given those interests, I thought I’d put that list online.

As you may know, the census bureau is probably the single most important organization for inculcating visual-statistical literacy in the American public, particularly through the institution of the Statistical Atlas of the United States published in various forms between 1870 and 1920.
A page from the 1890 Census Atlas: Library of Congress

Friday, May 23, 2014

Mind the gap: Incomes, college majors, gender, and higher ed reform

People love to talk about how "practical" different college majors are: and practicality is usually measured in dollars. But those measurements can be very problematic, in ways that might have bad implications for higher education. That's what this post is about.

I'll start with a paradox that anyone who talks to young people about their college majors should understand.

Let's say you're going to college to maximize your future earnings. You've read the census report that says your choice of major can make millions of dollars of difference, so you want to pick the right one. In the end, you're deciding between majoring in finance or nursing. Which one makes you more money?

Correction, 5/24/14: I've just realized I made an error in assigning weights, which meant the numbers I gave originally in this post were for heads of household only, not all workers. I'm leaving the original figures visible next to the corrections in the text, because that's what people seem to do, and adding new charts while shrinking the originals down dramatically. None of the conclusions are changed by the mistake.

The obvious thing to do is look at median incomes in each field. Limiting to those who work 30 hours a week and are between 30 and 45 years old, you'd get these results. (Which is just the sort of thing that census report tells you.)

Original version
Same chart, all workers
Nursing majors make a median of $65,000 [corrected from $69,000]; finance majors make $70,000 [corrected from $78,000].
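For the curious, numbers like these are straightforward to pull out of the public ACS microdata. Here's a rough sketch using IPUMS-style variable names--an assumption about the extract, not the actual code behind this post. A proper version would also apply the PERWT person weights, which is exactly the kind of thing I got wrong in the correction above:

```python
import pandas as pd

# Hypothetical IPUMS ACS extract; DEGFIELD is field of degree,
# UHRSWORK is usual hours worked per week, INCWAGE is wage income.
acs = pd.read_csv("acs_extract.csv")
working = acs[(acs["UHRSWORK"] >= 30) & acs["AGE"].between(30, 45)]
print(working.groupby("DEGFIELD")["INCWAGE"].median())
```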

That means you'll make about 8% more as a finance major (13% with the original figures), right?

Wrong. This is pretty close, instead, to a straightforward case of Simpson's Paradox.* Even though the average finance major makes more than the average nursing major, the average individual will make more in nursing. Read that sentence again: it's bizarre, but true.

How can it be true? Because any individual has to be male or female. (Fine, not really: but for the purposes of government datasets like this, you have to choose one). And when you break down pay by gender, something strange happens:

Original version, head of household only


Male nurses do indeed make less than male finance majors ($72,000 [corrected from $85,000] vs $76,000 [corrected from $80,000] in median income).
But that's more than offset by the fact that female nurses make much more than their finance counterparts ($67,699.78** [corrected from $64,000] vs $61,000 [corrected from $57,000]). The average person will actually make more with a nursing degree than with a finance degree.

So why the difference? Because there are hardly any men who major in nursing, and hardly any women who major in finance, so the median income ends up being about male wages for finance, and about female wages for nursing.
Original version, heads of household only.

The apparent gulf between finance and nursing has nothing to do with the actual majors, and everything to do with the pervasive gender gaps across the American economy.
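You can rerun the whole argument in a few lines. The gender split of each major below is invented for illustration (the real shares come from the ACS), and medians don't literally blend like means do, but the arithmetic shows the mechanism:

```python
# All-worker figures quoted above, treated here as stylized averages.
pay = {("nursing", "M"): 72_000, ("nursing", "F"): 67_700,
       ("finance", "M"): 76_000, ("finance", "F"): 61_000}
share_male = {"nursing": 0.10, "finance": 0.75}  # assumed gender mix

for major in ("nursing", "finance"):
    m = share_male[major]
    overall = m * pay[(major, "M")] + (1 - m) * pay[(major, "F")]
    person = 0.5 * pay[(major, "M")] + 0.5 * pay[(major, "F")]
    print(f"{major}: overall {overall:,.0f}, average person {person:,.0f}")

# finance wins the 'overall' comparison (72,250 vs 68,130) because its
# majors are mostly men; but for a randomly chosen person, half male and
# half female, nursing pays more (69,850 vs 68,500).
```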

Like many examples of Simpson's paradox, this has some real-world implications. There's a real push (that census report is just one example) to think of college majors more vocationally. Charts of income by major are omnipresent. There's even a real danger that universities will get some federal regulation that uses loan repayment rates, which won't be independent of income, to determine which colleges are doing a "good" or "bad" job.

Every newspaper chart or college loan program that doesn't disaggregate by gender is going to make the majors that women choose look worse than the ones that men choose. Think we need more people to major in computer science, engineering, and economics? Think we need fewer sociology, English, and Liberal Arts majors? That's not just saying that high-paying fields are better: it's also saying that the sort of fields women major in more often are less worthwhile.

How important is gender? Very. A male English major probably makes about the same as a female math major [corrected from "makes more than"], and a female economics major makes less than a male history major. So the next time you see someone arguing that only fools major in art history, remind them that the real thing holding back most English majors in the workplace isn't their degree but systemic discrimination against their sex in the American economy.***

By the way: you might be thinking, "That's great: the ACS includes major, now we have some real evidence." You shouldn't. Data collection isn't apolitical. The reason that the ACS includes major is that the state has turned its gaze to college major as a conceivable area of government regulation. We're going to get a lot of thoughtlessly anti-humanities findings out of this set: for example, that census department report grandly concluded that people who major in the humanities are much less likely to find full-time, year-round employment, while burying in a footnote that schoolteachers--the top or second-most common job for most humanities majors--don't count as year-round employees because they take the summer off.**** So, brace yourself. One of the big red herrings will be focusing on earnings for 23-year-olds; this ignores both the fact that law (which you can't start until age 26) is a common and lucrative destination for humanities majors, and the fact that liberal arts majors catch up, since their skills (to speak of it instrumentally) don't atrophy as quickly. Not to mention all the non-pecuniary rewards.*****

So one of the big challenges over the next few years for advocates of fields that include a lot of women (which include psychology, education, and communication, as well as many of the humanities) is going to be sussing out the implications of the gender gap for proposed policies and regulations. A perfectly crafted higher ed policy would, of course, take this into account: but it's extremely unlikely that we'll get one of those, if indeed we need one. It would be a bitterly ironic outcome if attempts to fix college majors ended up rewarding fields like computer science for becoming systematically less friendly to women over the last few decades.

This isn't to say there aren't real effects: pharmacology and electrical engineering majors do make more money, certainly, than arts or communications majors. But while the gender disparity is a massive, critical element of every discussion of wages, it's not the only thing lurking behind these numbers. (I've only imperfectly adjusted for age, for example.)

So I'm reluctant to give the average incomes at all, since I suspect that even with gender factored out they might confuse us. Still, it's worth thinking about. So here they are: a chart of the most common majors, showing median income for men and women: the arrows show the shift from the actual median income to what it would be if both genders were equally represented.

Original version: heads of household only.



*Actually, this isn't a perfect case of Simpson's paradox, because male nurses really do make less than male finance majors; there's a third variable at play here, the size of the gender gap within each field: although it's everywhere, the gender gap isn't necessarily the same size.

**Median incomes usually come out as round numbers, because most people report it approximately; but sometimes, as here, they don't.

***I don't actually recommend you do precisely that, from a lobbying perspective. 

**** That's why I've made the somewhat questionable choice of not reducing the set down to "full time year round" workers as is conventional: instead, I'm using the weaker filter of persons under 60 with a college degree who worked at least 30 hours a week. 

***** Which, yes, I believe are more important than the few thousand dollars you might get by agreeing to sell pharmaceuticals the rest of your life. But it's critically important not to just cede the field on less exalted measures of success.

Thursday, April 3, 2014

Biblio bizarre: who publishes in Google Books

Here's a little irony I've been meaning to post. Large-scale book digitization makes tools like Ngrams possible; but it also makes tools like Ngrams obsolete for the future. It changes what a "book" is in ways that make the selection criteria for Ngrams—if it made it into print, it must have some significance—completely meaningless.

So as interesting as Google Ngrams is for all sorts of purposes, it seems it might always end right in 2008. (I could have sworn the 2012 update included data through 2011 in some collections; but all seem to end in 2008 now.)

Lo: the Ngram chart of three major publishers, showing the percentage of times each is mentioned compared to all other words in the corpus:


Monday, March 31, 2014

Shipping maps and how states see

A map I put up a year and a half ago went viral this winter; it shows the paths taken by ships in the US Maury collection of the ICOADS database. I've had several requests for higher-quality versions: I had some up already, but I just put up on Flickr a basically comparable high resolution version. US Maury is "Deck 701" in the ICOADS collection: I also put up charts for all of the other decks with fewer than 3,000,000 points. You can page through them below, or download the high quality versions from Flickr directly. (At the time of posting, you have to click on the three dots to get through to the summaries).



I've also had a lot of questions about modern day equivalents to that chart. This, it turns out, is an absolutely fascinating question, because it forces a set of questions about what the Maury chart actually shows. Of course, on the surface, it seems to show 19th century shipping routes: that's the primary reason it's interesting. But it's an obviously incomplete, obviously biased, and obviously fragmentary view of those routes. It's a relatively complete view, on the other hand, of something more restricted but nearly as interesting: the way that the 19th century American state was able to see and take measure of the world. No one, today, needs to be told that patterns of state surveillance, data collection, and storage are immensely important. Charts like these provide an interesting and important locus for seeing how states "saw," to commandeer a phrase from James Scott.

So I want to explore a couple of these decks as snapshots of state knowledge that show different periods in the ways states collected knowledge as data. In my earlier pieces on shipping, I argued that data narratives should eschew individual stories to describe systems and collectives. States are one of the most important of these collectives, and they have a way of knowing that is at once far more detailed and far more impoverished than that of the bureaucrats who collect for them. These decks are fascinating and revealing snapshots of how the state used to, and continues to, pull in information from the world. (More practically, this post is also a bit of mulling over some questions for talks I'll be giving at the University of Nebraska on April 11 and the University of Georgia on April 22nd--if you're in either area, come on down. Some of the unanswered questions may be filled in by then.)


Wednesday, June 26, 2013

Crisis in the humanities, or just women in the workplace?

OK: one last post about enrollments, since the statistic that humanities degrees have dropped by half since 1970 has been all over the news for the last two weeks. This is going to be a bit of a data dump: but there's a shortage of data on the topic out there, so forgive me.

In my last two posts, I made two claims about that aspect of the humanities "crisis":

1) The biggest drop in humanities degrees relative to other degrees in the last 50 years happened between 1970 and 1985; the lower level over the last 25 years is not far out of line with pre-1960 levels of humanities majors (and far exceeds them if you account for population).

2) The entirety of the long term decline from 1950 to the present has to do with the changing majors of women, not of men.

To understand where the long-term parts of the crisis come from, that implies, you have to look at what women used to major in, and how those majors have changed. That's what this post is about.

Gender and the long-term decline in humanities enrollments

A quick addendum to my post on long-term enrollment trends in the humanities. (This topic seems to have legs, and I have lots more numbers sitting around I find useful, but they've got to wait for now).

David Brooks claimed in the Times, responding to the American Academy's report on the humanities and social sciences, that the humanities "committed suicide" by focusing on "class, race and gender" instead of appealing to "the earnest 19-year-old with lofty dreams of self-understanding and moral greatness." There's a lot wrong with this argument. Most of it is obvious from information already on the Internet. (David Silbey notes some of it vis-à-vis my last stats here.)

The most ironic part, though, is that hard-to-find data about the structural role of gender in university enrollments makes nonsense of Brooks' narrative that the humanities were undone by studying gender. Government bureaucrats have always been careful, though, to segregate degrees by gender in their reports.* Those are the reports I typed up to get the trend lines back to 1948 for my last post.

*I don't want to put off any earnest 19-year-olds out there: but one might argue that a persistent state interest in segregating educational achievement by gender suggests a certain degree of, shall we say, purposeful reproduction of sexual difference as a category of exclusion by the state. 

If Brooks is right, one would expect a general decline in enrollments since the 1950s. But the long term results actually show that since 1950, only women have shown a major drop in the percentage of humanities majors. (And keep in mind that the college population has increased dramatically in this period: this is just about college students). Men are just as likely (7%) to major in the humanities as they were in 1950, although there was a large spike in the 1960s.


Friday, June 7, 2013

Some long term perspective on the "crisis" in humanities enrollment

There was an article in the Wall Street Journal about low enrollments in the humanities yesterday. The heart of the story is that the humanities resemble the late Roman Empire, teetering on a collapse precipitated by their inability to provide jobs like those computer science can. (Never mind that the news hook is a Harvard report about declining enrollments in the humanities, which makes pretty clear that the real problem is students who are drawn to the social sciences, not competition from computer scientists.)

But to really sell a crisis, you need some numbers. Accompanying this was a graph credited to the American Academy of Arts and Sciences showing a spectacular collapse in humanities enrollments. I happen to have made one of the first versions of this chart while working there several years ago. Although it shows up in the press periodically to enforce a story of decay, some broader perspective on the data makes clear that the "humanities in crisis" story has the wrong interpretation, the wrong baseline, and the wrong denominator.


Friday, May 24, 2013

Turning-point years in history

What are the major turning points in history? One way to think about that is to simply look at the most frequent dates used to start or end dissertation periods.* That gives a good sense of the general shape of time.

*For a bit more about how that works, see my post on the years covered by history dissertations: I should note I'm using a better metric now that correctly gets the end year out of text strings like "1848-61."

Here's what that list looks like: the most common years used in dissertation titles. It's extremely spiky--some years are a lot more common than others.



Thursday, May 9, 2013

What years do historians write about?

Here's some inside baseball: the trends in periodization in history dissertations since the beginning of the American historical profession. A few months ago, Rob Townsend, who until recently kept everyone extremely well informed about professional trends at the American Historical Association,* sent me the list of all dissertation titles in history that the American Historical Association knows about from the last 120 years. (It's incomplete in some interesting ways, but that's a topic for another day.) It's textual data. But sometimes the most interesting textual data to analyze quantitatively are the numbers that show up. Using a Bookworm database, I just pulled out from the titles any years mentioned: that lets us see what periods of the past historians have been most interested in, and what sorts of periods they've described.

*Townsend is now moving on to the American Academy of Arts and Sciences, where I'm excited to see that he'll manage the Humanities Indicators—my first real programming/data project was putting together the first version of them with Malcolm Richardson immediately after college.



Numbers between 500 and 2000 are almost always years. You can see here that the vast bulk of historical study has been of the period since 1750: the three spikes out of the landscape correspond to the Civil War and the two world wars. Output decreases in the late 20th century in large part because the data set goes back to about 1850; but as we'll see in the next chart, not entirely.
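Extracting the years is mostly a regular expression; the fiddly part is the metric mentioned in the May 24 post above, re-expanding abbreviated ranges like "1848-61". Here's a sketch of the rule, which is my reconstruction rather than the actual Bookworm code:

```python
import re

RANGE = re.compile(r"\b(\d{3,4})\s*-\s*(\d{1,4})\b")   # e.g. "1848-61"
YEAR = re.compile(r"\b(\d{3,4})\b")

def years_in_title(title):
    """Years a dissertation title mentions, keeping only 500-2000."""
    found = []
    for start, end in RANGE.findall(title):
        # Re-expand an abbreviated end year ("61") from the start
        # year's century ("1848" -> "1861").
        if len(end) < len(start):
            end = start[:len(start) - len(end)] + end
        found += [int(start), int(end)]
    found += [int(y) for y in YEAR.findall(RANGE.sub(" ", title))]
    return [y for y in found if 500 <= y <= 2000]

# years_in_title("Reform in Prussia, 1848-61") -> [1848, 1861]
```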

Friday, April 12, 2013

How not to topic model: an introduction for humanists.

The new issue of the Journal of Digital Humanities is up as of yesterday: it includes an article of mine, "Words Alone," on the limits of topic modeling. In true JDH practice, it draws on my two previous posts on topic modeling, here and here. If you haven't read those, the JDH article is now the place to go. (Unless you love reading prose chock full've contractions and typos. Then you can stay in the originals.) If you have read them, you might want to know what's new or why I asked the JDH editors to let me push those two articles together. In the end, the changes ended up being pretty substantial.

Friday, March 29, 2013

Patchwork Libraries

The hardest thing about studying massive quantities of digital texts is knowing just what texts you have. This is knowledge that we haven't been particularly good at collecting, or at valuing.


The post I wrote two years ago about the Google Ngram chart for 02138 (the zip code for Cambridge, MA) remains a touchstone for me because it shows the ways that materiality, copyright, and institutional decisions can produce data artifacts that are at once inscrutable and completely understandable. (Here's the chart--go read the original post for the explanation.)



Since then, I've talked a lot about the need to understand both the classification schemes of individual libraries and the complicated historical provenance of the digital sources we use.

What I haven't done is give a real account of the sources of the books in Bookworm. That's pretty hypocritical. I apologize. I decided some time ago to use the Open Library to provide all the metadata for Bookworm. It has very good, high-quality library metadata; but it doesn't indicate where the metadata, or the volume itself, came from. (The meta-metadata, if you will, is not great.) That's an ontological issue. Open Library describes "editions," so there's no space for a field that tells you where an individual pdf or text volume came from.

I recently loaded this library data in, though. The Internet Archive web site, which stores the actual books, does say where a particular volume comes from. So with a little behind-the-scenes magic, it's pretty easy to get that into the database in some form.
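For anyone who wants to replicate that magic: the `internetarchive` Python package exposes each item's metadata, and the scanning library usually shows up in the contributor field. Field names vary from item to item, so treat this as a sketch, and the identifier below is hypothetical:

```python
import internetarchive as ia  # pip install internetarchive

def contributing_library(identifier):
    """Look up the library recorded as contributing a scanned volume."""
    meta = ia.get_item(identifier).metadata
    return meta.get("contributor", "unknown")

# e.g. contributing_library("someoldbook00smituoft")  # hypothetical id
```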

I've been learning some D3 to mock up possible multi-dimensional interfaces to the Bookworm data through the API. (If you come to the Bookworm booth at the DPLA meeting next month, you can play with this as well as some new projects others in the lab have been building on the platform.)

Using that, it's possible to quickly mock up a chart of the... (drumroll)...

 Library origins of the books in Bookworm. (Most common libraries only)
The colors are on a log scale here, and each little line represents a single combination of a library and year. So a red horizontal band is an area where a library has contributed hundreds of books to the Internet Archive (and therefore Bookworm) each year, a yellow to orange band means dozens of books a year, and the areas of scattered green show libraries that only contribute a volume or two every few years. I only show the most common libraries.

What can we see here? The number one contributor is the University of California Libraries. (No surprise.) The Robarts Library at the University of Toronto is number two--that's only surprising to me because so few of the books turn out to have been published in Canada when I've segmented by country. Most of the other large libraries are the ones you would expect: both some of the big university/Google partner libraries (Harvard, Oxford, Michigan, NYPL) and a few free agents who scanned on their own or in cooperation with the Internet Archive (Library of Congress, Boston Public (I think)).

The temporal patterns are more interesting. The 1922 cutoff line is the dominant visual feature of the chart--the extremely different composition of the lending libraries after that date (MIT and Montana State suddenly become important players, while sources like Cal and the LOC vanish). This is why comparisons across 1922 are never safe, even when the numbers seem big enough. But the complicated nature of the copyright line and scanning policies is also clear: for Canadiana.org, it's 1900, and for Oxford, it seems to be in the 1880s. There are clearly some metadata problems, as well (I think a lot of those post-1922 Harvard books are misattributed).

There are also a lot of strange little clusters of intense collection. Duke and the Lincoln Financial collection have massive spikes from about 1861-1865; the Boston Public Library seems to have an abnormal number of books right around 1620 and 1776. Those are specifically historical collections that will affect the sort of conclusions one can draw from aggregate data (much like the questions I was asking about a Google presentation that purported to show that more history was published in revolutionary times). Although oddly enough, the first few BPL volumes from 1620 I checked were in French or Latin--it's not quite the model of founding charters that I expected.

The digital libraries we're building are immense patchworks. That means they have seams. It's possible to tell stories that rest lightly across the many elements. But push too hard in the wrong direction, and conclusions can unravel. I don't think that any conclusions I've made in the past are unsettled by knowing where the books came from--although that Oxford dropoff has me wondering what might happen to transatlantic comparisons. But I'm glad to be able to see it, and so wanted to share. This is something we need to do.

And this isn't an issue particular to just Bookworm, although it does have a slightly more tangled line of transmission than some other libraries. Here, for example, are the top 16 libraries in the HathiTrust catalog from 1780 to 1820, scraped from their web catalog. 1800 is not a year that dramatically changes the publication history of books; but it is a year that librarians will use for their own arbitrary purposes. For instance, a book published in 1799 is considerably more likely to have been "rare" and therefore off limits to the Google scanners in the period of book scanning, I'd bet anything, just because human beings use rules of thumb like that; or, in the period that university libraries were being built, acquisition policies may have been quite different for pre-1800 vs post-1800 texts in various genres. ("Don't buy any science books written before 1800" would be a sensible policy to adopt in 1870, for example. Almost all the libraries here would not have bought the bulk of their 1799 collections in the year 1799.)
I pulled out the year 1800 (1800 and 1900 are both massive spikes where all sorts of unknown books can be filed). That also serves to highlight the gaps from 1799 to 1801; they're sometimes quite significant. What you see here are the same sorts of seams and discontinuities as in Bookworm, albeit on a smaller scale. The number of volumes from NYPL, Cal, and Harvard doubles overnight; other libraries, like Michigan and Madrid, seem not to show any pattern. If Michigan and California collect different sorts of books, this can cause major headaches for comparisons across the line.

There's no grand lesson here. Or if there is, it's just the old one: know your sources. But as a public service announcement, it's worth making again and again.