In this article we have sought to construct a computational model of SOC style to study its diffusion as a world literary form. We find support for our initial hypothesis that SOC followed a wavelike pattern of dispersion from the world literary system’s core to its semiperiphery and periphery. Yet, at each stage of our analysis, our model charts the broad contours of this diffusion while exposing how this diffusion is marked by constant, heterogeneous variance. We do not see a single, monolithic pattern of diffusion but patterns of dissemination. In other words, we find patterns of difference (or variation) in sameness. This is an idea that is not extrinsic to computational or statistical methods but is deeply embedded in them. Indeed, among humanists, a common misunderstanding of modern statistical modeling is that quantitative models seek to explain everything about a social phenomenon and leave no room for interpretive ambiguity or indeterminacy. The opposite is true. A key feature of every statistical model is an error term that captures precisely what the model cannot explain. Moreover, a common reflexive technique in modeling is to estimate a model’s own inability to fully measure the underlying processes that generate a data set. Modeling is thus deeply invested in indeterminacy, whether of itself or of the data to which it is applied. [Long + So 364–5]
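Long and So’s point about the error term can be made concrete. Here is a minimal sketch (toy data and a hand-rolled least-squares fit, mine rather than theirs) in which the residuals are exactly the part of the data the fitted model cannot explain:

```python
# Minimal sketch (toy data, hand-rolled least squares): the residuals
# below are the "error term", i.e. precisely what the model cannot explain.

def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return a, b

xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]   # a trend plus irregular deviations

a, b = fit_line(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]

# The fit captures the trend; the residuals quantify the indeterminacy.
print(round(a, 3), round(b, 3))
print([round(r, 3) for r in residuals])
```

No observation sits exactly on the fitted line; the model records that fact rather than hiding it.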
Our habit of doubting ourselves, echoing our earnest but never conclusive efforts to address the misgivings of others, shows more than anything else how at home are digital humanists in the humanities. Indeed, we are perfectly capable of sustaining all sides of this debate without encouragement. Of course I speak as someone to whom this particular problem is merely “academic” (understand this word in a weirdly reversed sense), as I am not personally dependent, at least for the present, on academic funding, either “soft” or “hard”. As soon as the discussion became consequential, I suppose I might be reluctant to take issue with any administrator or committee charged with managing a budget or prioritizing line items. Their jobs are difficult enough, I imagine. Yet it is with considerable astonishment that I read accounts of Lord Browne’s Report (again, this is easy enough for readers to learn about online if they don’t already know too much) and its promise to make British universities more “competitive” (sic) by decimating funding for education in arts, language and literacy. Is this the way a great nation treats its children, I wonder? What would Dr Arnold think? [Piez]
Is it appropriate to deploy positivistic techniques against those self-same positivistic techniques? In a former time, such criticism would not have been valid or even necessary. Marx was writing against a system that laid no specific claims to the apparatus of knowledge production itself—even if it was fueled by a persistent and pernicious form of ideological misrecognition. Yet, today the state of affairs is entirely reversed. The new spirit of capitalism is found in brainwork, self-measurement and self-fashioning, perpetual critique and innovation, data creation and extraction. In short, doing capitalist work and doing intellectual work—of any variety, bourgeois or progressive—are more aligned today than they have ever been. [Galloway 110]
In anatomical distant reading, the reason that the scholar is refraining from critical close reading is its uneconomical time-consuming quality and its deliberately narrow focus, but in the other case, distant reading is the only way for the scholar to extend the scope of his study without the anxiety of impossibility of an ideal knowledge. As a consequence, in the former, the distant reader’s inductive reasoning guides the direction of the study, but in the latter, it is the constellation of previous studies that directs the inductive reasoning. When the comparatist has attentively distanced his focus of study from a limited number of works, his mastership over those works is still preserved, thus he has the power to direct the path of his inferences. But in the other case, the scholar’s access to the object of study is already funneled, and his only way through it is to pass the filter of others’ researches. Moretti’s idea of collaborative research is problematic not just for implication of the political hegemony of English or the imperialistic attitude of the comparatist, as Arac noted, but also regarding whether the role of induction is placed primary or secondary. There is a huge difference between using other scholars’ works as source of inspiration, influence, or acknowledgment, and treating them as “data”. [Khadem 415]
For literary research to be possible, there must be some ambiguous space between patterns that are transparently legible in our memories and patterns that are too diffuse to matter. In fact this ambiguous space is large and important. We often dimly intuit literary-historical patterns without being able to describe them well or place them precisely on a timeline. For instance, students may say that they like contemporary fiction because it has more action than older books. I suspect that changes in pacing are part of what they mean. There is actually plenty of violence in Robinson Crusoe (1719), but it tends to be described from a distance, in summaries that cover an hour or two. We don’t see Crusoe’s fingers slipping, one by one, off the edge of a cliff. Twentieth-century fiction is closer to the pace of dramatic presentation. Protagonists hold their breath; their heartbeat accelerates; time seems to stand still. This pace may feel more like action, or even (paradoxically) faster, although diegetic time is actually passing more slowly from one page to the next. Even high-school students can feel this difference, although they may not describe it well. Narratologists have described it well, but typically credit it, mistakenly, to modernism. This change is a real literary phenomenon—in fact a huge one. But to trace its history accurately we need numbers. [Underwood 349]
The essence of the traditional humanist’s work-style is illuminated by comparing the pace and character of research publication across the disciplines, as Thomas Kuhn suggested years ago from his own experience (1977: 8–10). It varies widely, from the rapid exchange of results in the sciences to the slower pace of argument in the humanities. To varying degrees within the humanities themselves, this argument is the locus of action: the research itself (e.g. in philosophy) or its synthesis into a disciplinary contribution (e.g. in history) takes place during the writing, in the essay or monograph, rather than in a non-verbal medium, such as a particle accelerator. Contrast, as Kuhn did, the traditional research publication in experimental physics, which reports on results obtained elsewhere. In the natural sciences, as that ‘elsewhere’ has shifted from the solitary researcher’s laboratory bench to shared, sometimes massive equipment or through a division of labour to the benches of many researchers, collaboration has become a necessity. In the humanities, scholars have tended to be physically alone when at work because their primary epistemic activity is the writing, which by nature tends to be a solitary activity. Humanists have thus been intellectually sociable in a different mode from their laboratory-bound colleagues in the sciences.
If we look closely at this solitary work, we have no trouble seeing that the normal environment has always been and is virtually communal, formerly in the traditional sense of ‘virtually’ — ‘in essence or effect’ — and now, increasingly, in the digital sense as well. However far back in time one looks, scholarly correspondence attests to the communal sense of work. So do the conventions of acknowledgement, reference and bibliography; the crucial importance of audience; the centrality of the library; the physical design of the book; the meaning of publication, literally to ‘make public’; the dominant ideal of the so-called ‘plain style’, non sibi sed omnibus, ‘not for oneself but for all’; and of course language itself, which, as Wittgenstein argued in Philosophical Investigations, cannot be private. Writing only looks like a lonely act. [McCarty 12–13]
Ten years ago, I had a brief flirtation with another kind of unbifurcated garment, the Utilikilt, and discovered the joy of an undivided Y-axis; now having learned the pleasures of an unbroken X-axis, I’m contemplating trying out a muu-muu, just to see what it’s like when all bifurcations are dispensed with. If you find me walking around town wearing a trashbag with a neckhole and two armholes cut out, you have my permission to send me home and make me put on some proper clothing. [Doctorow]
The sort of person who jumps in and gives advice to the masses without doing a lot of research first generally believes that you should jump in and do things without doing a lot of research first. [West]
The Red Tribe is most classically typified by conservative political beliefs, strong evangelical religious beliefs, creationism, opposing gay marriage, owning guns, eating steak, drinking Coca-Cola, driving SUVs, watching lots of TV, enjoying American football, getting conspicuously upset about terrorists and commies, marrying early, divorcing early, shouting “USA IS NUMBER ONE!!!”, and listening to country music.
The Blue Tribe is most classically typified by liberal political beliefs, vague agnosticism, supporting gay rights, thinking guns are barbaric, eating arugula, drinking fancy bottled water, driving Priuses, reading lots of books, being highly educated, mocking American football, feeling vaguely like they should like soccer but never really being able to get into it, getting conspicuously upset about sexists and bigots, marrying later, constantly pointing out how much more civilized European countries are than America, and listening to “everything except country”.
(There is a partly-formed attempt to spin off a Grey Tribe typified by libertarian political beliefs, Dawkins-style atheism, vague annoyance that the question of gay rights even comes up, eating paleo, drinking Soylent, calling in rides on Uber, reading lots of blogs, calling American football “sportsball”, getting conspicuously upset about the War on Drugs and the NSA, and listening to filk — but for our current purposes this is a distraction and they can safely be considered part of the Blue Tribe most of the time) [Alexander]
Reinforcement learners take a reward function and optimize it; unfortunately, it’s not clear where to get a reward function that faithfully tracks what we care about. That’s a key source of safety concerns.
By contrast, AlphaGo Zero takes a policy-improvement-operator (like MCTS) and converges towards a fixed point of that operator. If we can find a way to improve a policy while preserving its alignment, then we can apply the same algorithm in order to get very powerful but aligned strategies.
Using MCTS to achieve a simple goal in the real world wouldn’t preserve alignment, so it doesn’t fit the bill. But “think longer” might. As long as we start with a policy that is close enough to being aligned — a policy that “wants” to be aligned, in some sense — allowing it to think longer may make it both smarter and more aligned. [Christiano]
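Christiano’s fixed-point framing mirrors classical policy iteration. A sketch on a two-state toy MDP (the MDP and the code are my illustration, not from the post): repeatedly apply a policy-improvement operator until it no longer changes the policy, i.e. until a fixed point is reached.

```python
# Toy illustration (assumptions mine): iterating a policy-improvement
# operator on a tiny MDP converges to a fixed point of that operator.

GAMMA = 0.9
STATES = [0, 1]
ACTIONS = ["stay", "switch"]

# P[s][a] = deterministic next state; R[s][a] = immediate reward.
P = {0: {"stay": 0, "switch": 1}, 1: {"stay": 1, "switch": 0}}
R = {0: {"stay": 0.0, "switch": 1.0}, 1: {"stay": 2.0, "switch": 0.0}}

def evaluate(policy, iters=500):
    """Approximate V^pi by iterating the Bellman equation."""
    V = {s: 0.0 for s in STATES}
    for _ in range(iters):
        V = {s: R[s][policy[s]] + GAMMA * V[P[s][policy[s]]] for s in STATES}
    return V

def improve(policy):
    """The policy-improvement operator: act greedily w.r.t. V^pi."""
    V = evaluate(policy)
    return {s: max(ACTIONS, key=lambda a: R[s][a] + GAMMA * V[P[s][a]])
            for s in STATES}

policy = {0: "stay", 1: "switch"}  # an arbitrary starting policy
while True:
    new_policy = improve(policy)
    if new_policy == policy:       # the operator no longer changes it
        break
    policy = new_policy

print(policy)
```

The analogy to the quote: the algorithm is defined by the improvement operator, not by a reward function it blindly maximizes, which is why preserving a property (here optimality; in Christiano’s case alignment) under each improvement step is the crux.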
It’s also possible to take this reasoning to an extreme—to become radically pessimistic about the consequences of pessimism. In “Suicide of the West,” the conservative intellectual Jonah Goldberg argues that progressive activists—deluded by wokeness into the false belief that Western civilization has made the world worse—are systematically dismantling the institutions fundamental to an enlightened society, such as individualism, capitalism, and free speech. (“Sometimes ingratitude is enough to destroy a civilization,” Goldberg writes.) On the left, a parallel attitude holds sway. Progressives fear the stereotypical paranoid conservative—a nativist, arsenal-assembling prepper whose world view has been formed by Fox News, the N.R.A., and “The Walking Dead.” Militant progressives and pre-apocalyptic conservatives have an outsized presence in our imaginations; they are the bogeymen in narratives about our mounting nihilism. We’ve come to fear each other’s fear. [Rothman]
The lag time from major successes in Deep Learning to generally-socially-concerned research funding like that of the Media Lab / Klein Center has been several years, depending on how you count. We need that reaction time to get shorter and shorter until our civilization becomes proactive, so that our civilizational capability to align and control superintelligent machines exists before the machines themselves do. I suspect this might require spending more than 0.00001% of world GDP on alignment research for human-level and super-human AI.
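For scale, a back-of-the-envelope calculation (the world-GDP figure is my rough assumption, not Critch’s): 0.00001% of roughly $100 trillion is on the order of $10 million per year.

```python
# Back-of-the-envelope check (the ~$100 trillion world-GDP figure is a
# rough assumption of mine): what does 0.00001% of world GDP come to?

world_gdp = 100e12          # ~ $100 trillion, order of magnitude only
fraction = 0.00001 / 100    # 0.00001% expressed as a fraction

spending = world_gdp * fraction
print(f"${spending:,.0f}")  # on the order of $10 million
```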
Granted, the transition to spending a reasonable level of species-scale effort on this problem is not trivial, and I’m not saying the solution is to undirectedly throw money at it. But I am saying that we are nowhere near done the on-ramp to acting even remotely sanely, at a societal scale, in anticipation of HLAI. And as long as we’re not there, the people who know this need to keep saying it. [Critch]
The title of a paper by the Belgian social scientist Guillaume Wunsch, ‘God has chosen to give the easy problems to the physicists, or why demographers need theory’, accords with a remark attributed to Gregory Bateson, that ‘there are the hard sciences, and then there are the difficult sciences’. Both remarks point in two directions: towards the disciplines paradigmatic of what is ‘hard’, in the dual sense of solid reality and difficult methods; and towards those other disciplines, still in the shadow of the first, which in their relative softness present difficulties (so it is claimed) of a much more demanding kind. The imagery is dubious for other reasons, but within its own frame, what is it saying? When softness is regarded as bad, what is the problem? [McCarty 145-6]
Best paper title ever.
Digital methods can provide new evidence and even new kinds of evidence in support of literary claims, and can make new kinds of claims possible. They can also make some claims untenable. In addition to allowing for “distant” kinds of readings of enormous collections of texts that are simply too large to be studied otherwise, the extraordinary powers of the computer to count, compare, collect, and analyze can be used to make our close readings even closer and more persuasive. Perhaps the availability of new and more persuasive kinds of evidence can also inspire a greater insistence on evidence for literary claims and push traditional literary scholars in some productive new directions. I would not argue that digital methods should supplant traditional approaches (well, maybe some of them). Instead, they should be integrated into the set of accepted approaches to literary texts. [Hoover]
The ‘well, maybe some of them’ is the epitome of scholarly politeness, here, but I’d be a little less polite. A lot of our traditional approaches are shockingly vague, partial, and wrong-headed, and we should supplant them as soon as it’s technically possible to do so.
Vaillant’s other main interest is the power of relationships. “It is social aptitude,” he writes, “not intellectual brilliance or parental social class, that leads to successful aging.” Warm connections are necessary—and if not found in a mother or father, they can come from siblings, uncles, friends, mentors. The men’s relationships at age 47, he found, predicted late-life adjustment better than any other variable, except defenses. Good sibling relationships seem especially powerful: 93 percent of the men who were thriving at age 65 had been close to a brother or sister when younger. In an interview in the March 2008 newsletter to the Grant Study subjects, Vaillant was asked, “What have you learned from the Grant Study men?” Vaillant’s response: “That the only thing that really matters in life are your relationships to other people.” [Wolf Shenk]
With familiar competitive habits, this growth rate change implies falling wages for intelligent labor, canceling nature’s recent high-wage reprieve. So if we continue to use all the nature our abilities allow, abilities growing much faster than nature’s abilities to resist us, within ten thousand years at most (and more likely a few centuries) we’ll use pretty much all of nature, with only farms, pets and (economically) small parks remaining. If we keep growing competitively, nature is doomed.
Of course we’ll still need some functioning ecosystems to support farming a while longer, until we learn how to make food without farms, or bodies using simpler fuels. Hopefully we’ll assimilate most innovations worth digging out of nature, and deep underground single cell life will probably last the longest. But these may be cold comfort to most nature lovers. [Hanson]
It takes a deliberate effort to visualize your brain from the outside—and then you still don’t see your actual brain; you imagine what you think is there, hopefully based on science, but regardless, you don’t have any direct access to neural network structures from introspection. That’s why the ancient Greeks didn’t invent computational neuroscience. [Yudkowsky]
Access to stored knowledge has changed over time. It was available to only a few when most people were illiterate. Universal education then made such knowledge accessible to almost all. But now we are moving toward a state in which a great deal of knowledge contained in databases is inaccessible to those without the necessary computational skills. This is not the familiar digital divide of economic and educational opportunities but an epistemological division based on a number of different factors. An important barrier is a lack of technical expertise. Degrees of epistemic inaccessibility based on abilities have always existed; understanding contemporary molecular biology is possible only to a limited degree for most knowers. But cognitive limitations are not the only obstacles to access. Proprietary algorithms and intellectual property laws may prevent open access to a database. Although such barriers are complemented by the great openness of information produced (accidentally) by the internet, the evidence suggests that we are now undergoing something like the successive periods of agricultural enclosure in England during the eighteenth and early nineteenth centuries, when common land was enclosed by private landowners. This inaccessibility of information to human agents can also arise because of the sheer size and complexity of the data and the calculations needed to process them, because we do not have the representational means to display that knowledge, or because we cannot construct suitable algorithms to process the data. [Alvarado + Humphreys 740]
With the emergence of “big data” collections, there are too many accessible texts to read each one closely; even if one could read them closely, it is unlikely that one could read them consistently; and if one could read them consistently, it is inconceivable that one would be able to remember even a small percentage of them. Developing a model of “meaning” by applying unsupervised machine learning techniques across the entire corpus might be a solution to this problem. Yet, while this is a worthwhile idea and one not addressed in this paper, such an approach would have limited applicability beyond providing a first level approximation of the general contours of topics in a particular literature at a particular time. Except for encyclopedic projects, most contemporary literary scholarship does not focus on making broad generalizations about a national literature, but rather it emphasizes narrower developments in the literary landscape coupled to a thorough contextual knowledge of the impact and spread of those developments. Analysis of this type is largely dependent on a scholar’s “domain expertise.”
Literary domain expertise is formed from the study of an imperfect and largely arbitrary canon. We say “largely arbitrary” as matters of reception, sales, publication, circulation, critical reviews and so on contribute significantly to the recognition of a literary work as exceptional. Exceptional works that have “staying power”—that are able to engage critics for a considerable period of time—are those that enter the canon. At the same time, despite the impression of immutability, the canon often changes radically over time so that unknown works can suddenly become known (and canonical), while well-known (and canonical) works can suddenly fall out of favor and disappear from the canon altogether. [Tangherlini + Leonard 726–7]
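The “unsupervised machine learning techniques” mentioned above are typically topic models; a crude stdlib-only stand-in (toy corpus and tf-idf scoring are mine) shows the kind of “first level approximation of the general contours of topics” the authors have in mind:

```python
# Crude sketch (toy corpus mine): surface each document's most
# distinctive terms with hand-rolled tf-idf, a coarse first-pass
# approximation of a corpus's topical contours.

import math
from collections import Counter

corpus = {
    "novel_a": "sea ship storm captain sea voyage island",
    "novel_b": "love letter marriage estate inheritance love",
    "novel_c": "ship captain mutiny sea island treasure",
}

docs = {name: text.split() for name, text in corpus.items()}
n_docs = len(docs)
df = Counter(w for words in docs.values() for w in set(words))

def top_terms(name, k=3):
    """Top-k terms by tf-idf, ties broken alphabetically."""
    tf = Counter(docs[name])
    scores = {w: c * math.log(n_docs / df[w]) for w, c in tf.items()}
    return [w for w, _ in sorted(scores.items(),
                                 key=lambda x: (-x[1], x[0]))[:k]]

for name in docs:
    print(name, top_terms(name))
```

Even this toy version illustrates the authors’ caveat: it yields general contours (nautical words here, domestic words there), not the contextual knowledge that domain expertise supplies.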
One hypothesis for why code is more repetitive than NL is that humans find code harder to read and write than NL. Code has precise denotational and operational semantics, and computers cannot deal automatically with casual errors as human listeners can. As a result, natural software is not merely constrained by simpler grammars; programmers may further deliberately limit their coding choices to manage the added challenge of dealing with the semantics of code.
Cognitive science research suggests that developers process software in similar ways to natural language, but do so with less fluency. Prior work (Siegmund et al, 2014) suggests that some of the parts of the brain used in natural language comprehension are shared when understanding source code. Though there is overlap in brain regions used, eye tracking studies have been used to show that humans read source code differently from natural language (Busjahn et al, 2015; Jbara and Feitelson, 2017). Natural language tends to be read in a linear fashion. For English, this would be left-to-right, top-to-bottom. Source code, however, is read non-linearly. People’s eyes jump around the code while reading, following function invocations to their definitions, checking on variable declarations, etc. Busjahn et al. (2015) found this behavior in both novices and experts, but also found that experts seem to improve in this reading style over time. In order to reduce this reading effort, developers might choose to write code in a simple, repetitive, idiomatic style, much more so than in natural language.
This hypothesis concerns the motivations of programmers, and is difficult to test directly. We therefore seek corpus-based evidence in different kinds of natural language. Specifically, we would like to examine corpora that are more difficult for their writers to produce and readers to understand than general natural language. Alternatively, we also would like corpora where, like code, the cost of miscommunication is higher. Would such corpora evidence a more repetitive style? To this end, we consider a few specialized types of English corpora: 1) corpora produced by non-fluent language learners and 2) corpora written in a technical style or imperative style. [Casalnuovo et al. 6–7]
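The repetitiveness comparison Casalnuovo et al. describe can be caricatured with a much cruder metric (the snippets and the measure are mine, far simpler than the paper’s n-gram language models):

```python
# Crude sketch (my own snippets and metric): measure repetitiveness as
# the fraction of token bigrams that repeat an earlier bigram.

from collections import Counter

def bigram_repetition(tokens):
    bigrams = list(zip(tokens, tokens[1:]))
    counts = Counter(bigrams)
    repeats = sum(c - 1 for c in counts.values())
    return repeats / len(bigrams)

code = """
for i in range ( 10 ) : total = total + i
for j in range ( 10 ) : count = count + j
""".split()

english = """
the results suggest that developers read source code quite
differently from how readers process ordinary english prose
""".split()

# Code reuses the same constructions far more often than prose.
print(bigram_repetition(code), bigram_repetition(english))
```

On these toy inputs the code snippet repeats a fifth of its bigrams while the prose repeats none, the same direction of effect the paper measures at corpus scale with proper language models.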