“Some circumstantial evidence is very strong, as when you find a trout in the milk.”
- Henry David Thoreau
I promise the above is a real quote and not the opening of a Monty Python sketch; Thoreau appreciated that not all evidence is created equal. As a criminal defense attorney, I have something similar, call it Dale’s Addendum to Thoreau’s Milk-Trout Theorem: Just because there’s a dead body in your living room doesn’t necessarily mean you’re guilty of murder… but you should probably have a good alibi or explanation. Like AI handling the image of a trout in milk, it is hard to perfectly describe and circumscribe all of the inferences that are implicated with a dead body in one’s parlor…
Now let me show you how conditional even that kind of fact is: if I told you that the body was a deceased relative and it was being displayed prior to burial, as is the custom in many cultures, like the Irish Catholic, then that changes the implications - or said another way, it changes the set of inferences that you could draw - from “a dead body in your living room.” If, however, instead I say that the body appears to have two large bullet holes in the chest, and I notice when I look up that there’s an empty rack in a case that has space for 4 shotguns, and one is missing… well, you can imagine that if I ask to look around and the Man of the House asserts his Fourth Amendment rights, you - just like any magistrate or judge - would have no problem at all granting a search warrant upon a finding of “probable cause” (that there might be evidence associated with a crime thereabouts on the premises). You don’t even need to know what those magic legal words “probable cause” mean to appreciate the difference in implication or inference between my two above examples.
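For readers who like to see the gears turn, the shift between those two scenarios can be sketched as a conditional-probability update using Bayes' theorem. This is only an illustration: the function name `posterior` and every number below are invented for the example, not drawn from any case or study. The point is only that the same starting fact ("a dead body in the living room") moves in opposite directions depending on what else you learn.

```python
def posterior(prior, p_evidence_if_guilty, p_evidence_if_innocent):
    """Bayes' theorem for two competing hypotheses: P(foul play | evidence)."""
    numerator = prior * p_evidence_if_guilty
    denominator = numerator + (1 - prior) * p_evidence_if_innocent
    return numerator / denominator


# ASSUMPTION: every number here is made up purely for illustration.
prior = 0.5  # plausibility of foul play knowing only "body in the living room"

# Scenario A: the body is a relative laid out for a wake.
# A wake is far more likely if nothing criminal happened.
after_wake = posterior(prior, p_evidence_if_guilty=0.01, p_evidence_if_innocent=0.60)

# Scenario B: two bullet holes and an empty slot in the gun rack.
# That combination is far more likely if a crime occurred.
after_shotgun = posterior(prior, p_evidence_if_guilty=0.70, p_evidence_if_innocent=0.02)

print(f"after learning it is a wake:       {after_wake:.3f}")
print(f"after bullet holes + missing gun:  {after_shotgun:.3f}")
```

Change the invented numbers and the outputs change, but the direction of each update does not, and that direction is what everyone in the room agrees on without doing any arithmetic.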
But how is it that we can all agree upon something like that, the relative strength of implication, and of what different inferences can be drawn from a series of related and/or competing propositions? What is the “rational principle” that’s operating “under the hood,” so to speak? Because it’s embedded in the entirety of the western judicial system, particularly the right to a trial by jury.
Laplace’s qualitative principle is his famous remark that “Probability theory is nothing but common sense reduced to calculation.” The main object of this paper is to show that this is not just a play on words, but a literal statement of fact.
One of the most familiar facts of our experience is this: that there is such a thing as common sense, which enables us to do plausible reasoning in a fairly consistent way. People who have the same background of experience and the same amount of information about a proposition come to pretty much the same conclusions as to its plausibility. No jury has ever reached a verdict on the basis of pure deductive reasoning. Therefore the human brain must contain some fairly definite mechanism for plausible reasoning, undoubtedly much more complex than that required for deductive reasoning. But in order for this to be possible, there must exist consistent rules for carrying out plausible reasoning, in terms of operations so definite that they can be programmed on the computing machine which is the human brain. This is the “experimental fact” on which our theory is based. We know that it must be true, because we all use it every day. Our direct knowledge about this process is, however, only qualitative in much the same way as is our direct experience of temperature.
E.T. Jaynes, “How Does the Brain Do Plausible Reasoning?” G. J. Erickson and C. R. Smith (eds.), Maximum-Entropy and Bayesian Methods in Science and Engineering (Vol. 1), 1-24; p. 2 (bold added).
“No jury has ever reached a verdict on the basis of pure deductive reasoning” is, one may notice, a claim hinted at in several prior chapters. We’ve come full circle, in a manner of speaking, and we’re now in a position to discuss this claim in the context of one of the most important trials in Anglo-American legal history: Bushel’s or Bushell’s Case, (1670) 124 E.R. 1006. Despite the name, this was not a trial about apples.
The year is 1670 and England is still very much in the grip of its Anglican anti-Catholic (and truthfully, anti-anything religious that is non-Anglican) fervor. The Conventicle Act of 1664 outlaws… conventicles, which (ahem, as everyone knows) are religious assemblies of more than five people other than an immediate family, outside the auspices of the Church of England and the 1662 Book of Common Prayer. Britannica gives some flavor of the times, but England’s “Anglicanism” became ascendant largely on the power of massive and repeated political persecution of any non-conforming religious subjects over the previous 125 years.
Important Background: In 1534, King Henry VIII wanted his marriage to Catherine of Aragon annulled because she hadn’t given him any sons, but The Pope said “Nope,” largely because the Pope had already granted Henry a papal dispensation to marry Catherine in the first place.1 Catherine had previously been married to Henry’s older brother Arthur, the heir to the throne, but he got all sick and croaked within a year of the wedding, causing a bit of a mess between Spain and England’s arranged marriage of these teenagers. Catherine also publicly claimed that she and Arthur had not consummated the marriage (IYKWIM…AITYD). It’s not certain, but she had a good enough case because of how sick and enfeebled Arthur was almost as soon as they married. So Henry becomes the heir in Arthur’s place and eventually King, and Henry’s lawyer was a guy by the name of Thomas More (yes! that guy! the Catholic patron saint of lawyers and subject of the wonderful if very-historically-selective Robert Bolt play, A Man for All Seasons).2 More helped Henry write the biblical analysis for the petition to the Pope that justified allowing Henry to marry Catherine, under the Levirate marriage doctrine in the Bible. Nearly two decades after the wedding, however, here comes Henry asking for a rather loud and public divorce, ahem annulment, from Catherine, who despite being Spanish by birth was very well-liked by the English public, and who was also seemingly a serious and devoted Catholic. King Henry is (of course) seen cavorting with Anne Boleyn while arguing that Catherine’s barrenness (despite 3 daughters, but no male heirs) was a result of their marriage violating Leviticus, a textual stretch because the distinction in the Levirate doctrine between marrying your brother’s widow (i.e. brother’s dead) vs. the Leviticus proscription of lying with your brother’s wife (i.e. brother’s alive) seems rather morally significant.3 Protestantism had already begun to run amok on the Continent because of some guy named Martin Luther (yep, that dude was in this mess, too).
So the Pope took a dim view of all of this (and he had his own problems at the time).
Henry told the Pope to stuff it, created his own Church of England with the King as its head, had Thomas More imprisoned in the Tower, then tried and executed, and thus kicked off roughly 125 years of the British crushing Catholic, or any non-Anglican, religious sentiment in all of England. It should be noted that the Pope at the time, Clement VII, born Giulio di Giuliano de’ Medici (YES! Them, too! The 1500s were crazier than people think.), was also cousin to the prior Pope, Leo X. At the same time that Clement was dealing with Henry VIII’s “marry-my-sister-in-law-now-divorce-my-sister-in-law” on the Island between 1524 and 1532, Clement had his hands full with the invasion of Italy by France’s Francis I, with Clement siding with Holy Roman Emperor Charles V (Battle of Pavia, 1525). But after Charles captured Francis I, Clement joined the League of Cognac on Francis’s side, which led Charles to sack Rome (1527-28). Catherine also asked to have her marriage validated by the Church (1528 - 1530) in opposition to Henry’s petition to have it annulled, all while Clement was essentially Charles V’s hostage, and Charles, who was Catherine’s nephew, very much did not want the annulment granted… and he had his own problems with a Continent starting to be overrun with Protestants claiming they had their own interpretations of the Bible. (Uppity peasants are the worst, amirite!?!)
So that’s where things are when England begins a series of retaliatory acts in the 1660s to stop all of this non-Anglican protestant proselytizing that’s going on.
Enter two Quaker ministers by the name of William Meade and William Penn - (yes! the guy who would go on to found the Commonwealth of Pennsylvania!). Penn was a young man of noble birth; the elder Sir William Penn had been an Admiral in the Royal Navy and a member of the House of Commons. Notwithstanding, the younger Penn got himself tossed into the Tower for the first time in 1667 for violating the Conventicle Act by going to “a meeting of Friends” - i.e. a religious gathering of Quakers. The mayor offered to free Penn on his own recognizance, but the 23-year-old refused and was sent to prison with eighteen others. Penn became a pamphleteer and wrote “The Sandy Foundation Shaken” to refute the doctrines of the trinity and the eternal damnation of souls, which got him tossed into prison again, not technically for his ideas, but because he had no license from the bishop of London to be writing that kind of heresy. While in the Tower of London Penn wrote his most famous book, “No Cross, No Crown.”
Call this the background evidence against Penn: so it was not a surprise to anyone that in 1670, Penn, along with his buddy William Meade, was arrested (yet again) in Gracechurch Street, London, for preaching in violation of the Conventicle Act. Penn is basically on his “third strike” as a non-conforming proselytizer. At trial the prisoners appeared before twelve judges and twelve jurors. Witnesses came in and said that between 400 and 500 people gathered in the street listening to Penn and Meade. Oh, yeah. These guys were way over their 5 person limit. From the trial record, it’s clear that the government witnesses were what trial lawyers call “reluctant” prosecution witnesses. One tried to walk back his claim that he heard them proselytizing, saying instead that he couldn’t hear over the noise exactly what they were on about. And the recorder (prosecutor) has to impeach his own witness with his prior testimony.4
The record from the trial at the Old Bailey is riveting.
Penn challenged the legality of the indictment and refused to plead without seeing a written copy of it; they didn’t give him one, so he pleaded not guilty. The next day in court, after a night in ‘the Hole,’ the prisoners were fined forty marks for failing to remove their hats; Penn tried to pull his best Matt Damon in Good Will Hunting by citing Coke on common law and the rights in the Great Charter (Magna Carta). Despite these arguments, or maybe even because of these arguments, the recorder charged the jury to bring in a verdict of guilty based upon the evidence presented. Four jurors, however, dissented, chief among them Edward Bushell - and, boy, the judges started giving it to him right there in front of everyone, threatening to “put a mark upon you, Sir!” Then the whole jury were sent back to “rethink” their verdict. The link above is worth a read just for the Olde Timey Englishe way in which the protagonists were treated to “barbarous usage”… well, you can just imagine how hard they were givin’ it to poor old Ed Bushell. It’s high drama - I can’t imagine why this hasn’t been a full-length movie with Gary Oldman.5
But what’s really fascinating is that the Recorder essentially tells the jury: “Hey, look, we brought in plenty of witnesses to show you these guys had a crowd of four to five-hundred gathered out on the street to whom they were proselytizing. The law says you can’t do that! These guys did it!!” Here comes the important part of the government’s “charge” to the jury: “There is only one possible conclusion - only one plausible inference - that can possibly be drawn from the state of the evidence as it lies before you. Guilty.”
But the jury refused to do it. When they come in the first time after voting, four people, including Edward Bushell, “dissented” from a guilty verdict.
Obser. They [the judges] used much menacing language, and behaved themselves very imperiously to the jury, as persons not more void of justice than sober education: After this barbarous usage, they sent them to consider of bringing in their verdict, and after some considerable time they returned to the Court. Silence was called for, and the jury called by their names.
Cler. Are you agreed upon your verdict?
Jury. Yes.
Cler. Who shall speak for you?
Jury. Our Foreman.
Clerk. Look upon the prisoners at the bar; how say you? Is William Penn Guilty of the matter whereof he stands indicted in manner and form, or Not Guilty?
Foreman. Guilty of speaking in Grace-church street.
Court. Is that all?
Foreman. That is all I have in commission.
The jurors delivered what is sometimes called a “special verdict” - offering only the facts they could all agree upon: “Guilty of speaking in Grace-church street.” They refused to add the words of guilt - that it was an “unlawful assembly” - as that is spelled out in the Conventicle Act. The magistrates refused to accept that, and ordered the jury to be “locked up without meat, drink, fire, and tobacco,” while Penn pleaded with them, like Mel Gibson in Braveheart (which is a nice irony), not to give up their rights as Englishmen. Penn and all twelve of the jury were sent to prison. Someone, likely Penn’s father, paid the fines, and most were released, but good ol’ stalwart Ed Bushell refused to pay and then sued the mayor and recorder in habeas corpus, eventually winning a historic decision.
In the Case of the Imprisonment of Edward Bushell, Vaughan’s Reports 135, (1670) 124 E.R. 1006, Chief Justice Vaughan analyzed both the government’s arguments and the existing case law surrounding the imprisonment of jurors for “false verdicts” - i.e. where the evidence is “full and manifest” of guilt of “the indicted” and the jurors “knowing the said evidence to be full and manifest” - and yet they do not convict. (Yes, there was already a body of common law on jury-fixing back in the 1600s - I’m tellin’ ya, the 16th and 17th centuries were nuts.)
C.J. John Vaughan’s rhetorical flourish in response to these arguments helped to establish and preserve (a) the right to a jury trial, (b) the right of a jury to nullify an indictment even in the face of “manifest evidence”, as well as (c) what we know as the Fifth Amendment prohibition against “double jeopardy” - being twice put in peril for the same offense once acquitted.6 And his decision hinges on exactly the subject that began this discourse: what inferences can a jury of ordinary people make with respect to competing propositions on evidence placed before them?
I would know whether anything be more common, than for two men students, barristers, or judges, to deduce contrary and opposite conclusions out of the same case in law? And is here any difference that two men should infer distinct conclusions from the same testimony? Is any thing more known than that the same author, and place in that author, is forcibly urged to maintain contrary conclusions, and the decision hard, which is in the right? Is anything more frequent in the controversies of religion, than to press the same text for opposite tenets? How then comes it to pass that two persons may not apprehend with reason and honesty, what a witness, or many, say, to prove in the understanding of one plainly one thing, but in the apprehension of the other, clearly the contrary thing? Must therefore one of these merit fine and imprisonment, because he doth that which he cannot otherwise do, preserving his oath and integrity? And this often is the case of the judge and jury.
I conclude therefore, That this return, charging the prisoners to have acquitted Penn and Mead, against full and manifest evidence, first and next, without saying that they did know and believe that Evidence to be full and manifest against the indicted persons, is no cause of fine or imprisonment.
Case of the Imprisonment of Edward Bushell, Vaughan’s Reports, 135 (bold added). Vaughan goes on from there to put a stake in the heart of any claim that a disagreeable verdict - absent some other extrinsic evidence of tampering or fixing - could ever serve as the basis for overturning it by the court system. Thus, jury nullification is simply embedded within the system we have.
The requirement for a unanimous verdict on a criminal trial is a way of tilting the scale; if all twelve must agree completely on guilt, the scales are weighted such that the government must extinguish all plausible inferences consistent with innocence in the minds of ALL twelve jurors lest there remain what lawyers call a reasonable doubt about the defendant’s guilt at the close of the evidence. Presumably, someone among the twelve should or would discover and point out the evidentiary lack by their appeal to the same sense of plausible inference that all of the jurors also use.
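A toy calculation makes the tilt concrete. Assume, purely for illustration, that each juror is independently persuaded beyond a reasonable doubt with the same probability. That independence assumption is unrealistic (jurors deliberate and influence one another), and the function name and numbers are hypothetical, but it shows how steeply a unanimity rule raises the bar.

```python
def p_unanimous_guilty(p_single_juror, jurors=12):
    """Chance that every juror is persuaded, ASSUMING jurors decide independently."""
    return p_single_juror ** jurors


# ASSUMPTION: per-juror persuasion probabilities are invented for illustration.
for p in (0.80, 0.90, 0.99):
    print(f"each juror persuaded with p={p:.2f} -> "
          f"all twelve persuaded with p={p_unanimous_guilty(p):.3f}")
```

Under this simplified model, even evidence that would persuade nine jurors in ten, taken one at a time, yields a unanimous guilty verdict well under half the time. That is the weighting of the scales described above, not a prediction about any real jury.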
What Bushell’s case discusses is exactly what Edwin Jaynes was talking about in the quote above about plausible reasoning, with Jaynes having the benefit of a few centuries of refinement in the mathematics underlying Laplace’s probability theory, Polya’s exploration of plausibility, Claude Shannon’s work at Bell Labs on “Information Theory”, and R.T. Cox’s Boolean algebra applied to plausibility as values on a number line.
We now turn to development of our first mathematical model. We attempt to associate mental states with real numbers which are to be manipulated according to definite rules. Now it is clear that our attitude toward any given proposition may have a very large number of different “coordinates”. We form simultaneous judgments as to whether it is probable, whether it is desirable, whether it is interesting, whether it is amusing, whether it is important, whether it is beautiful, whether it is morally right, etc. If we assume that each of these judgments might be represented by a number, a fully adequate description of a state of mind would then be represented by a vector in a space of a very large, and perhaps indefinitely large, number of dimensions.
Not all propositions require this. For example, the proposition, “The refractive index of water is 1.3” generates no emotions; consequently the state of mind which it produces has very few coordinates. On the other hand, the proposition, “Your wife just wrecked your new car,” generates a state of mind with an extremely large number of coordinates. A moment’s introspection will show that, quite generally, the situations of everyday life are those involving the greatest number of coordinates. It is just for this reason that the most familiar examples of mental activity are the most difficult ones to reproduce by a model. We might speculate that this is the reason why natural science and mathematics are the most successful of human activities; they deal with propositions which produce the simplest of all mental states. Such states would be the ones least perturbed by a given amount of imperfection in the human brain.
E.T. Jaynes, “How Does the Brain Do Plausible Reasoning?” p. 3.
Hopefully, I’ve at least made the case that the Law is much concerned with what inferences can be drawn from a given set of propositions - i.e. evidence. Notable omission: both science and legal historians may take me to task because in all of this I’ve left out Francis Bacon, who is arguably the first person to present a comprehensive case for science as induction… and he was also a lawyer. IOW, Bacon should be the very avatar for what I’m arguing, however… I chose to leave Bacon out! I have two reasons for this. One, the most straightforward, is that I haven’t read his works, though I’m broadly familiar with them. Two, and more important, it’s a much more difficult job to start with Bacon and say, “This guy was right!” and then have to spend time on what Hume, Popper, et al. did in response. Seems like a waste - but the critical historical point is that Hume’s inductive skepticism, as reified in academic science by Popper and Fisher, and even our Supreme Court (as we’ll see), is what won. Yes, there are industrial scientists doing real science to great effect and there is a cadre devoted to E.T. Jaynes’ work, but popular “Science” today is still doing Null-Hypothesis Significance Testing (NHST), despite multiple statements from the American Statistical Association cautioning all of the U.S. Scientific Community that a p-value, by itself, does not tell you the probability that your hypothesis is true.
For p-value defenders, I promise, I will get into this in more detail as we go along, beginning next time when we’ll look a little more closely at the math of conditional probabilities, Bayes’ theorem in some simple examples, and begin our look at the Frye decision and the Supreme Court’s Daubert decision as to “What is Science?”
Marrying one’s brother’s widow is an obligation according to a number of verses in the Bible: Genesis 38:8; Deuteronomy 25:5; Matthew 22:24; Mark 12:19; and Luke 20:28. This is known as levirate marriage… however (see FN 3 in a minute).
A more complete truth is that More was also known as a wicked persecutor of non-Catholics, i.e. heretics. During his years as Henry’s Lord Chancellor, More had six Lutherans burned at the stake, and many others imprisoned. More defended the anti-heresy laws and enforced them with zeal. Indeed, you could argue that the Monarchy’s (Henry’s and subsequent) reaction to More set the cause of Catholics in England - and America - back several hundred years. JFK was the first Catholic leader in Anglo-American history since Henry VIII was made fidei defensor “Defender of the Faith,” by the same Pope who later excommunicated him.
…Leviticus 20:21 says “And if a man shall take his brother’s wife, it is an unclean thing: he hath uncovered his brother’s nakedness; they shall be childless.” There is also a long list of forbidden sexual behaviors, including Lev. 18:16, “Thou shalt not uncover the nakedness of thy brother’s wife: it is thy brother’s nakedness.”
Presumably, Penn and Meade would have been okay if they’d been advocating about some kind of secular issue contra the King’s interest and drew a crowd of 4-500 for that…. Bwahahahaha!! No, they wouldn’t have. That’s what’s funny about all of this regardless.
Instead we get to live forever with seeing his nekked butt as Rev. Arthur Dimmesdale in “The Scarlet Letter.”
That was exactly what happened to Penn and Mead - the jury came back and said not guilty because they only had 8 votes to convict; the judges said “incorrect” and now the government had a second chance to get a verdict of guilty by the second (re-)vote.