I had a rather exhausting discussion with Sanjeev Sabhlok in the comments of this post on whether reservations (or more generally affirmative action) contradict basic principles of justice.
The only good thing that came out of that discussion (from my viewpoint) is it prompted me to glance through some pages of Friedrich Hayek’s Law, Legislation and Liberty. Specifically, I looked at his chapter on Social Justice. The entirety of my exposure to Hayek’s ideas is that one chapter, so I’m quite happy to be corrected.
It’s not hard to see why Hayek is held in high esteem. This small chapter covers a wide variety of issues, many of which I think are relevant to the issue of reservations and affirmative action. I find Hayek’s writing style hard to read, since his passages sound ambiguous to me. It’s probably true that he had a precise position on the issues, but his book isn’t written that way and feels somewhat open to interpretation.
I don’t claim to understand Hayek’s theories very well, but some things jumped out at me. Hayek says that we have the right to set rules, but that once the rules are set and the market is set in motion, it is pointless to speak of the justice of the market’s outcomes (as long as everyone follows the rules).
This is an interesting idea, and curiously it parallels Krishna’s “Karmanyeva adhikaraste…”! We have the right to decide the rules of action, but not to decide the outcomes, which are subject to much randomness! It is widely accepted that attempting to make sure everyone has what they need is socialism; indeed, that is often treated as the definition of socialism. Hayek says that attempting to make sure everyone gets exactly what they deserve is still socialist. He calls this social justice (or rather, says this is what others mean by social justice). But Hayek goes a step further in thinking about this, and makes two apparently contradictory statements.
First, Hayek seems to agree that laws should not be predictably biased towards or against a segment of the society. This I think is fascinating and in fact a crucial consideration while framing laws. At the time we frame a law, it should not be predictably unjust.
Elucidating what “predictable” means here is an interesting exercise in itself. My interpretation is the following. A predictable set of people at time t is a set that is determined by events occurring up to time t, and not after. Thus “Dalits in 2010” is a predictable set in 2010, but “millionaires in 2020” is not very predictable in 2010. However, “millionaires in 2010” is of course predictable in 2010. As with all social things, a certain level of fuzziness in defining sets is probably convenient. The set “billionaires in 2015” is probably predictable with 99% accuracy in 2010, although the set “millionaires in 2015” is much less predictable. Allowing for this slight fuzziness, “Dalits in 2050” is a predictable set in 2010. (People can and do “change their caste”, often through birth certificate fraud, but so few do it that the set is almost determined in 2010.) Have I defined predictability precisely? Not mathematically. But it seems precise enough for society, law and justice.
When we pass a law that increases the relative advantage or disadvantage of a predictable set of people to its complement, we are doing something wrong. Thus, if we pass a law in 2010 that will widen the advantage gap between, say, the 40-th and 60-th wealth percentile of the population in 2020, the principle of predictability does not prohibit this (since these percentiles are not very predictable sets). On the other hand, if we pass a law that will widen the advantage gap between the blind and the not-blind in 2015 (a moderately predictable set in 2010), there is something wrong with that law. Similarly, if we pass a law in 2010 that increases the advantage gap between Dalits and non-Dalits in 2050 (a highly predictable set), there’s something wrong with that law. This is the gist of my application of Hayek’s predictability criterion to the affirmative action case.
Of course, it’s not quite as simple as that. It’s not enough to say widening the gap is bad and narrowing it is good — we should also worry about whether things are getting better for everybody. If we pass a law that predictably reduces everyone to abject poverty, this might reduce the gap — but it’s not what we want. On the other hand, passing a law that predictably makes Dalits remain poor while increasing most non-Dalits’ wealth is also obviously wrong — even though it is true that some people have gained, and no one has been harmed (relative to where they started off). Also, most outcomes will invariably be biased if a short enough time-frame is chosen. For example, it’s certainly true that, no matter what anyone does, those who are poor on Jan 1, 2010 will overwhelmingly remain poor on Jan 2, 2010 — or for that matter on Jan 1, 2011. Since ALL laws are biased, should we refrain from passing any laws? A reasonable time frame has to be attached to the term “bias”.
Thus, this is a very loose principle — laws will need to balance fairness, considerations of where people start off, practicality, enforceability, acceptability in society, timeframe and a number of other factors. Indeed, I think it’s not possible to state a succinct, simple principle that can be the sole guiding principle behind all laws, or even identify all the factors that need to be considered.
Equality of Opportunity
Now, it seems as if equality of opportunity is a natural consequence of this concept of no predictable bias. After all, if there’s no predictable set of people that is better off than another, isn’t this the same as saying that the law is equally unbiased towards everybody? It seems any principle for framing laws should lead to laws that give everyone the same opportunities, even though various random events would likely lead to differences in final outcomes.
This is where Hayek seems to make a contradictory statement. Hayek says that equality of opportunity is also a socialist ideal — not a-priori, but in its implications:
To achieve this government would have to control the whole physical and human environment of all persons, and have to endeavour to provide at least equivalent chances for each; and the more government succeeded in these endeavours, the stronger would become the legitimate demand that, on the same principle, any still remaining handicaps must be removed-or compensated for by putting extra burden on the still relatively favoured. This would have to go on until government literally controlled every circumstance which could affect any person’s well-being.
This sounds correct — it is obviously impractical to demand that the government provide perfect equality of opportunity to every single individual. But Hayek himself says that
So far as [equality of opportunity] refers to such facilities and opportunities as are of necessity affected by governmental decisions (such as appointments to public office and the like), the demand was indeed one of the central points of classical liberalism, usually expressed by the French phrase ‘la carriere ouverte aux talents’. There is also much to be said in favour of the government providing on an equal basis the means for the schooling of minors who are not yet fully responsible citizens…
It seems that equality of opportunity is perhaps not inevitably socialist or classical liberal, but rather a mixture of the two tempered by the extent to which it is practical. That is, government should endeavour to provide equality of opportunity up to the point where it has to start taking socialist actions like controlling people’s lives. The line between providing equality of opportunity and socialism is blurred — so blurred that it’s silly to pretend there’s a line (my thoughts, not Hayek’s).
How does all this tie in with affirmative action? The principle that laws should not be predictably biased would seem to indicate that affirmative action is necessary. The current system is extremely harmful for Dalits and certain other backward classes. Indeed, the state completely failed them for several decades, a situation that is only now starting to be rectified. Under current laws, and under any law that completely denies all forms of affirmative action, Dalits will predictably be disadvantaged and continue to be punished by the system for several decades.
It is important to note that this reasoning does not apply to every group that is disadvantaged. If a Muslim and a Brahmin are equally smart and study in the same class in the same school (I’m establishing ceteris paribus here), I think the Brahmin has no advantage compared to the Muslim. They are equally likely, or almost equally likely, to find good jobs. In addition, opportunities available to Muslim and Brahmin kids are the same modulo their own beliefs. That is, if a community of Muslims chose to reach out and accept the available opportunities, they would be no worse off than a community of Brahmins. The same is not true for Dalits. There are active as well as passive forces arrayed against the Dalits.
Thus, Hayek’s own notion of not predictably harming someone via legislation seems to support the idea of affirmative action for Dalits.
The important question is whether this can be classified as actively harming non-Dalits. I don’t believe so. Increasing opportunity for Dalits in this way certainly decreases opportunity for non-Dalits, but opportunity was lop-sided to begin with, and the lop-sidedness continues to be maintained using marginally legal methods. With affirmative action, entrance into various lucrative positions becomes tougher for non-Dalits, but still not as tough as it is for Dalits.
My Position on Affirmative Action
For the record, my own position is a guarded support for certain forms of affirmative action in the short term.
I think it’s important to base affirmative action not only on caste, but on as many major sources of predictable variability as practical. This is the topic of the MIRAA score discussed in my other post.
I also believe affirmative action is nothing but a temporary pressure valve measure to quickly correct certain imbalances. It is no substitute for free, high quality universal education. Education, not affirmative action, should be the method of choice for ensuring equality of opportunity. Education is the only sustainable long-term means of achieving equality of opportunity. The only reason for affirmative action is that it seems impossible to equalize “predictable opportunity” using education alone in the next 30 years.
Karl Popper, in his 1953 lecture on science, described three pseudo-scientific theories thus:
I found that those of my friends who were admirers of Marx, Freud, and Adler, were impressed by a number of points common to these theories, and especially by their apparent explanatory power. These theories appeared to be able to explain practically everything that happened within the fields to which they referred. The study of any of them seemed to have the effect of an intellectual conversion or revelation, opening your eyes to a new truth hidden from those not yet initiated. Once your eyes were thus opened you saw confirming instances everywhere: the world was full of verifications of the theory. Whatever happened always confirmed it. Thus its truth appeared manifest; and unbelievers were clearly people who did not want to see the manifest truth; who refused to see it, either because it was against their class interest, or because of their repressions which were still ‘un-analysed’ and crying aloud for treatment.
I think Marx and Freud specialized in selling beautiful theories. Their ideas had (have) a simplicity and basic unity that people found (still find) attractive. Although the ideas were elucidated in pages upon pages of writing, there seemed to be some common underlying principles that were elegant and simple, yet possessed of wide explanatory power. The ideas were beautiful.
A lot of people mistake beauty for truth. I first came across this theme while reading blogs by self-styled economic liberals. It seemed to me that many of these liberals are excessively concerned with coming up with pithy one-liner descriptions of reality. A pithy, attractive, succinctly stated “theorem” that allows them to conclude major things or explain a wide variety of phenomena impresses readers and raises cachet. The problem is that the world usually doesn’t admit such oversimplified explanations, but some people are so in love with their one-liners that they continue trying to shoehorn every fact to fit such theories. The problem isn’t just related to blogs; it’s widespread at all levels.
In fields like mathematics, beauty is an asset, because it’s often obvious what’s true. (Obvious in the sense that a trained mathematician can, with sufficient work, in most cases, correctly determine whether an argument is correct or not.) In the other sciences, carefulness in establishing the truth usually trumps the coolness factor. However, economics exists on the dangerous boundary between storytelling and empirical truth. Economic theories are often grand, sweeping — and aren’t subject to the kinds of simple tests scientific theories are. So beauty is an attractive feature of a theory, sometimes the most attractive feature. This is perhaps why Karl Marx’s views are so attractive: they give a beautiful explanation that fits the facts. Economists themselves recognize the problem. I took the phrase “mistaking beauty for truth” from an article by Paul Krugman.
Scientists, like economists, are not immune to the fallacy. Mathematicians often strive for elegance in their proofs, but they are supposed to. Other scientists, however, may get swayed by such considerations too. An interesting example of this is the canonical explanation for the reason moths are attracted to light. For a long time, this was supposed to be due to confusing lights for the moon. The theory went that moths fly in a straight line by keeping a constant angle to the rays of the moon. The moon being a faraway object, its rays are almost parallel by the time they reach the earth. So this is a good approximation for the moth; it wouldn’t deviate much from a straight line in a flight of several miles if it kept a constant angle from the moon.
However, if a moth sees a light that is much closer by and mistakes it for the moon, the situation is quite different. The rays are no longer parallel, and keeping a constant nonzero angle to these rays results in a curved path. If the moth maintains a perfect 90-degree angle with the light rays, it will fly in a circle. However, the theory goes, moths tend to maintain an acute angle to the light rays. This causes them to move in a spiral that leads inward toward the light. It’s fairly easy to calculate an equation for this spiral. The polar equation of the curve followed by the moth when it maintains an angle of $\alpha$ to the rays, parametrized in terms of the angle $\theta$ subtended at the origin between the current position and the positive X axis, starting at the point $(r_0, 0)$, is given by $r(\theta) = r_0 e^{-\theta \cot \alpha}$.
Here’s what it looks like in 2D:
Beautiful. The only problem is, it isn’t true. This isn’t why moths fly towards light, and that was shown by the first careful experiment to be done. Looking at the above graph, the moth circles the light several times before falling into it. Henry Hsiao, a biomedical engineering researcher, studied moths’ flight patterns and found that they don’t fit this behaviour. His alternative theory has to do with Mach bands. But why did this theory about the equiangular spiral survive so long? It seems to me its mathematical beauty trumped considerations of its veracity. It’s just another example of beauty being mistaken for truth.
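For anyone who wants to redraw the spiral, here is a minimal sketch of the (incorrect, but pretty) constant-angle model. It assumes the light sits at the origin, the moth starts at distance r0 on the positive X axis, and its heading always makes an angle alpha with the ray pointing at the light. Under those assumptions dr/dθ = -r cot(alpha), which gives r(θ) = r0 · exp(-θ · cot(alpha)). The function names, the default of 80 degrees and the step counts are illustrative choices, not anything taken from the moth literature.

```python
import math


def spiral_radius(theta, r0=1.0, alpha_deg=80.0):
    """Distance from the light after sweeping an angle theta (radians).

    Assumes a constant angle alpha between the moth's heading and the
    ray toward the light, so that r = r0 * exp(-theta * cot(alpha)).
    """
    alpha = math.radians(alpha_deg)
    return r0 * math.exp(-theta / math.tan(alpha))


def spiral_points(turns=3, steps_per_turn=90, r0=1.0, alpha_deg=80.0):
    """Cartesian (x, y) points along the spiral, ready for plotting."""
    points = []
    for i in range(turns * steps_per_turn + 1):
        theta = 2 * math.pi * i / steps_per_turn
        r = spiral_radius(theta, r0, alpha_deg)
        points.append((r * math.cos(theta), r * math.sin(theta)))
    return points


if __name__ == "__main__":
    # At 90 degrees the radius never shrinks (a circle); at an acute
    # angle each full turn shrinks the radius by the same factor.
    print(spiral_radius(2 * math.pi, alpha_deg=90.0))
    print(spiral_radius(2 * math.pi, alpha_deg=80.0))
```

The closer alpha is to 90 degrees, the more times the model moth circles the light before reaching it, which is the behaviour the real flight paths did not show.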
Humans have very little body hair compared to most other primates. There are several theories on this, summarized well at http://www.nytimes.com/2003/08/19/science/19HAIR.html, which makes for a very interesting read. The currently accepted theory is that it’s an adaptation that helped avoid diseases spread by lice and ticks which found hiding places in fur.
This immediately raises the question: why didn’t other animals lose their fur for the same reason? That looks like an easy question to answer. Lice and ticks are not the only factor. For example, in cold climates man could survive by wearing the fur of animals he killed. Animals couldn’t do any such thing, so having a fur pelt was much more important than getting rid of lice. This sounds like a plausible explanation, but it is wrong. The article says that man lost his body hair about 1.2 million years ago (estimated assuming the tick theory holds), but only started wearing clothes about 50,000 years ago, so there was a period of more than a million years when man was hairless and naked. Other animals (at least those living in the same regions as man) could have lost their fur during this time, but didn’t. I don’t have a good explanation.
All this changes if we consider a different theory for the loss of body hair. There are many we could consider, but one of the more fanciful theories is that of the Aquatic Ape, which contends that, sometime in our remote past when all humans were still living in Africa, all the ancestors of currently extant humans were trapped in a small region around a shallow sea. According to the theory it’s possible there were other humans not trapped this way, but they all died out. Eons later, another cataclysm scattered these humans, who had now developed adaptations like the loss of fur to be able to swim better, as well as a slight webbing between their fingers, and a host of other adaptations. To me this sounds more like science fiction than fact, and it has little acceptance in the scientific community. But it’s a fun theory.
The November 2006 issue of New Scientist carried a series of articles entitled “The Big Questions”. One of the articles was titled “Do We Have Free Will?”. The same question was asked in a New York Times science section article on 2 Jan, 2007 entitled “Free Will: Now You Have It, Now You Don’t”. So, why are people asking the question? Isn’t it obvious that free will exists?
The question arises because all humans are, after all, collections of molecules obeying physical laws. In that sense, anything and everything we do is simply pre-determined by the laws of physics. We cannot have free will because we are bound by the physical laws governing our molecules. There is nothing “free” about our will. There is no such thing as choice, and consequently, no such thing as free will.
The New Scientist article cites the example of a person who was turned into a deviant by a brain tumour. When the tumour was removed, he became normal again. When the tumour later re-grew because a portion of it had been missed, the man exhibited the same deviant behaviour again. The man did not choose to do bad things; he was simply a slave to the physical processes leading to his deviant behaviour.
Why should we care? How does the question of whether we have free will affect our everyday life? To borrow one clear example from the New Scientist article, in a situation where a disease results in deviant behaviour, should the person be punished? Most legal systems are based on an assumption of free will. A perpetrator is punished because he or she is responsible for the crime in the sense that he or she willfully committed the crime. This entails an exercise of free will. Thus, those who can demonstrate an inability to exercise free will at the time of perpetration of a crime are those who are sentenced leniently or even let off unpunished. Now consider what would happen if everyone has an inability to exercise free will. Would no one be culpable for any crime in that case?
I think free will exists only as an illusion. We don’t really have free will, and we are in fact slaves to physical laws. When we think we are making a decision, we are in fact only obeying the dictates of the laws of physics.
However, the illusion of free will is both useful and consistent. It is consistent because we cannot predict in advance what will happen. It is useful because we can use it to make sense of the way human societies organize themselves. This needs some elaboration. Maybe in a later post.
Free Will is Really a Question of Epistemology
Free will may have more to do with epistemology than physics. Suppose the universe is indeed deterministic in the collection of rules sense. That is, there is a set of transition functions that tell us how to calculate the next state of the universe given the current state of the universe. However, we still lack complete knowledge of the future because we don’t have access to these transition functions. Besides, we may not be able to apply the transition function.
First, we don’t know the entire current state of the universe, which is required as an input to the transition functions.
Second, suppose we grant that the first point is not an obstacle. To answer a particular question maybe we only need a partial state of the universe which we actually know. However, applying the transition function to calculate the next state may not be in the causal sequence for us. That is, the transition functions themselves may not predict that we will apply them and find out the future.
So that’s why we don’t know the future. Now the crux of my argument is this: that what we term “Free Will” is simply a statement about our state of knowledge. When we are faced with two choices A and B, and we say “I made choice A out of my own Free Will at time 2”, what we are really saying is “At time 1 I didn’t have the knowledge/computational resources/ability to know for sure that I was restricted to choice A at time 2”. Since we don’t have knowledge of the future’s certainty, we don’t feel it is certain. We feel that we have made a choice.
In this sense, Free Will is an illusion. It is not that we could have made choice B; it is just that we feel we could have made choice B. The feeling is Free Will, and it arises from the fact that, as part of the universe, we cannot calculate at time 1 what the choice at time 2 will be, or even see that the choice is fixed (since we are not intuitively aware of the transition functions). Applying intuition, since we cannot calculate the choice or even perceive the transition functions, we assume that “we could have made” another choice.
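A toy sketch may make the argument concrete. Everything in it is an assumption made purely for illustration: the “universe” is just a pair of integers (one part we can see, one part we cannot), and the hypothetical function `transition` is a made-up deterministic rule, not a model of real physics.

```python
def transition(state):
    """Hypothetical deterministic rule: the choice made at time 2,
    computed from the complete state of the toy universe at time 1."""
    visible, hidden = state
    return "A" if (visible + hidden) % 2 == 0 else "B"


def choices_consistent_with(visible, possible_hidden):
    """Every choice that an observer who knows only the visible part
    of the state cannot rule out."""
    return {transition((visible, h)) for h in possible_hidden}


if __name__ == "__main__":
    # Given the full state, the choice at time 2 is fixed.
    print(transition((3, 5)))  # A

    # Given only the visible part, both choices still look open.
    print(sorted(choices_consistent_with(3, range(10))))  # ['A', 'B']
```

The rule never changes between the two calls; only the observer’s knowledge does. The second result is what, in this toy picture, “could have chosen B” amounts to.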
I learned today of some absolutely fascinating interactions between Popperian critical rationalism and the theory of causality. The counterfactual theory of causality is to scientists probably the most useful of the causal theories. The way in which Popperian philosophy enters is in the meaning of counterfactuals and whether they are related by a so-called Structural Equation Model (SEM).
To illustrate, suppose we observe and are interested in three variables, $X, Z, Y$. We know that $X$ comes before $Z$ and $Y$ chronologically, and $Z$ comes before $Y$. Let’s talk about counterfactuals first. We observe $Y$, but suppose we now ask ourselves what would have happened to $Y$ if, instead of letting nature calculate $X$, we intervened to set it to some value such as $X = 1$. We denote this new, imagined variable by $Y(1)$. We might also be interested in the counterfactual variable $Y(0)$, which is the new variable that “would result” if we could intervene to set $X$ to 0. Counterfactual theories assume that variables such as $Y(1)$ “exist” and are “available to nature”, and that if $X = 1$ actually happens, then nature responds by producing this version of $Y$. If, instead, $X = 0$ occurs, then nature produces the entirely different variable $Y(0)$.
Let us try to reconcile the existence of such counterfactual variables with common views of determinism. If we tried to imagine a process by which nature “calculates” the variables $X, Z, Y$, we’d be tempted to come up with something like this: $X = f_X(\epsilon_X)$, $Z = f_Z(X, \epsilon_Z)$, $Y = f_Y(X, Z, \epsilon_Y)$. Here $\epsilon_X, \epsilon_Z, \epsilon_Y$ are three independent random variables (often called “noise”). So nature first calculates $X$ from some noise variable $\epsilon_X$ using a secret process $f_X$ (just a function, really, but unknown to us). Then it calculates $Z$ from the previously calculated value of $X$ and the noise variable $\epsilon_Z$ using the function $f_Z$, and finally calculates $Y$ from $X, Z, \epsilon_Y$ using $f_Y$.
Such SEMs are very useful. They can be represented graphically and analyzed mathematically to answer a great many questions in causality, leading to a rich theory developed by Judea Pearl, among many others. An additional bonus is that these models account for counterfactual variables in a very natural way. Suppose, for example, that we are interested in the counterfactual $Z(x)$, that is, the version of $Z$ that results if someone intervened and set the variable $X$ to $x$. (The mechanism by which this might be done is not relevant here.) Nature then simply computes the function $f_Z(x, \epsilon_Z)$. Another example is the counterfactual $Y(x)$, which means that $Z$ is first computed as $Z(x) = f_Z(x, \epsilon_Z)$, and then $Y(x) = f_Y(x, Z(x), \epsilon_Y)$. The counterfactual $Y(x, z)$ is naturally defined as $f_Y(x, z, \epsilon_Y)$. Thus counterfactuals have very natural semantics in the SEM setting.
The SEM is also a very natural model to humans. Much of what we perceive as Newtonian physics and determinism works this way. Event $X$ happens, this influences event $Z$ (but doesn’t determine it perfectly, which is why $\epsilon_Z$ is necessary), and so on. Indeed, when I ask myself what other models of the universe might exist (and this question excludes quantum uncertainty and other similar weirdness), I am unable to conceive of a process which doesn’t boil down to an SEM of the type shown above. Most people, when asked to imagine a process by which the universe “creates” events, will probably come up with an SEM.
Perhaps unsurprisingly, this kind of model is inadmissible in the Popperian view, since it uses an underlying justification. Intuitively, the structural equations justify the counterfactuals. Delving into things a little more deeply, it turns out that this simple model makes a large number of assumptions that are not verifiable, and so are inadmissible in the Popperian view. But we need a specific example.
A Weird Counterfactual
Specifically, consider the “cross-world counterfactual” $Y(1, Z(0)) = f_Y(1, Z(0), \epsilon_Y)$. Mathematically, there is no problem with this definition; the function $f_Y$ is evaluated at $1, Z(0), \epsilon_Y$, which are themselves well-defined constants or random variables. The problem is with interpretation and observability. Any of the previous counterfactuals could be obtained in a natural way. To observe $Y(0)$, set $X$ to 0, let nature determine $Z$ and observe the resulting $Y$. To observe $Y(x, z)$, set both $X$ to $x$ and $Z$ to $z$ and observe the corresponding $Y$. But the new quantity is fundamentally different; it involves both $X = 1$ (the first argument of $f_Y$) and $X = 0$ needed to observe $Z(0)$. This means what we’re trying to do is the following complicated procedure: first intervene and set $X = 0$, obtain the random variable $Z(0)$, and then turn back the clock, and now intervene to set $X = 1$ as well as set $Z = Z(0)$, using the $Z(0)$ observed before turning back the clock.
Why Critical Rationalism Excludes SEMs
But of course, we can’t really turn back clocks. Thus the weird counterfactual above is unobservable; there should be no way for us to evaluate it under Popperian critical rationalism. If we can evaluate it, it means we are using a rich justification with implications that cannot be verified experimentally. Now, under the SEM it can be shown that
$E[Y(1, Z(0))] = \sum_z E[Y \mid X = 1, Z = z] \, P(Z = z \mid X = 0)$
Since all of the quantities on the right hand side are based on observed variables, it means this cross-world counterfactual, which we “couldn’t possibly identify” without turning back the clock, is actually identifiable: a paradox!
What’s the resolution to this paradox? Simply that the SEM makes so many hidden assumptions that we can actually identify it. This explains why SEMs don’t conform to Popperian thought, and why a lot of work focuses on causality and counterfactuals without structural equations.
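The identification claim can be checked numerically in a toy model. The sketch below is only an illustration under assumptions I am making up: the three variables are called X, Z and Y, all are binary, and the structural functions `f_X`, `f_Z`, `f_Y` and their probabilities are hypothetical. The “oracle” line turns back the clock by reusing the same noise draws, which no real experimenter can do. The “formula” estimate uses only the observed (X, Z, Y) rows, via the standard mediation formula: the sum over z of E[Y | X=1, Z=z] · P(Z=z | X=0).

```python
import random


def f_X(e_x):
    """Hypothetical structural function for X (a fair coin)."""
    return 1 if e_x < 0.5 else 0


def f_Z(x, e_z):
    """Hypothetical structural function for Z; X raises P(Z = 1)."""
    return 1 if e_z < (0.7 if x == 1 else 0.2) else 0


def f_Y(x, z, e_y):
    """Hypothetical structural function for Y; X and Z both raise P(Y = 1)."""
    return 1 if e_y < 0.1 + 0.3 * x + 0.4 * z else 0


def simulate(n=200_000, seed=0):
    """Return (oracle, formula) estimates of E[Y(1, Z(0))]."""
    rng = random.Random(seed)
    oracle_total = 0
    counts = {}  # (x, z) -> [number of rows, sum of y]
    for _ in range(n):
        e_x, e_z, e_y = rng.random(), rng.random(), rng.random()
        # What an observer actually gets to see.
        x = f_X(e_x)
        z = f_Z(x, e_z)
        y = f_Y(x, z, e_y)
        cell = counts.setdefault((x, z), [0, 0])
        cell[0] += 1
        cell[1] += y
        # Cross-world value: same noise, X = 0 for Z but X = 1 for Y.
        oracle_total += f_Y(1, f_Z(0, e_z), e_y)
    oracle = oracle_total / n
    # Mediation formula, computed from observed rows only.
    n_x0 = counts[(0, 0)][0] + counts[(0, 1)][0]
    formula = 0.0
    for z in (0, 1):
        p_z_given_x0 = counts[(0, z)][0] / n_x0
        mean_y_given_x1_z = counts[(1, z)][1] / counts[(1, z)][0]
        formula += mean_y_given_x1_z * p_z_given_x0
    return oracle, formula


if __name__ == "__main__":
    oracle, formula = simulate()
    print(oracle, formula)
```

With these made-up probabilities the exact value is 0.4 × 0.8 + 0.8 × 0.2 = 0.48, and both estimates should land close to it. The agreement depends entirely on the independent-noise assumption baked into the simulation, which is exactly the kind of hidden assumption the resolution above points to.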
I noticed a statement on the box of the cereal I was eating for breakfast today. It said something to the effect that
1. Eating this cereal may reduce your risk of heart disease.
It was starred, of course, and the fine print said something to the effect that
2. Diets low in saturated fat and high in whole grains may confer a reduced risk of heart disease.
The problem is this: the first statement makes it appear like the cereal is actively doing something in your body to combat heart disease. That is not what the second statement says (at least based on how I interpreted it). The second statement says that compared with diets with higher levels of saturated fat and (presumably) refined flour instead of whole grains, the risk of heart disease is lower.
Thus if you add the cereal to your diet, all other things being equal, there’s nothing in the second statement that tells you that risk is reduced. Yet the blurb implies it is.
Of course all other things may not remain equal. Eating the cereal will probably reduce your appetite, causing you to eat less of the unhealthy stuff. This implies that eating the cereal does have a causal effect on heart disease.
It all depends on how the majority of people interpret the first statement. Does it make them feel as if they can continue to eat unhealthy food and treat the cereal as an antidote? Or is it obvious that replacing another, less healthy food with this cereal is what will provide benefits?
Science has a strange fascination. Everybody would like to lay claim to it. Those who follow the scientific method, the scientists, are of course the practitioners. But even those whose work has little connection to science lay claim to it. Some of the most insidious of these are the medical hacks, the creationists (a.k.a. intelligent design proponents), and the “Christian Science” folks. But there are other relatively harmless co-opters, such as some science fiction authors and filmmakers.
There’s a general feeling that science automatically implies truth, and everybody wants to claim their version is the truth. However, I think science only implies truth under some assumptions extraneous to the scientific method. So I’m going to provide an example of a situation under which science fails to provide the truth.
The best way to explain what I mean is to assume a model for the universe that is consistent with everything we observe. This model is as follows: suppose that the entire observable universe is really a simulation on a computer. Everything has been programmed by somebody. We are not directly aware that we are simulations because, of course, we are part of the simulation. The programmer may have coded some “natural laws” into his simulation and then let it run. The programmer may or may not intervene; if he does, we call it a “miracle”. In this case, the truth is that there is a god — the programmer.
Science is a method, or maybe a collection of methods, for understanding things we observe (inside the simulation). The method is supposed to yield an explanation (theory/hypothesis/law of nature) for observed phenomena. The primary tenets of the scientific method are that
1. the phenomenon must be verifiable. That is, it must be observable by multiple independent observers. It isn’t enough if one person claims the phenomenon occurred.
2. the explanation must be testable or falsifiable. That is, it should be possible to make predictions which, if false, disqualify the explanation — and which can be tested experimentally. If I claim that God is the explanation, this isn’t falsifiable. There is no clear prediction anyone can make which, if false, conclusively proves that God doesn’t exist. This is closely related to the explanation being “in-universe” i.e. a “God” explanation doesn’t satisfy these requirements.
3. the results of experiments to test these predictions must be verifiable and reproducible. That is, multiple independent observers must be able to reproduce the results.
4. the explanation should be simple. That is, a complex explanation would be rejected in favour of a simple one, until the simple one fails. This pertains to what we mean by an “explanation” or law. If we didn’t include this requirement, every phenomenon would suffice as its own explanation — not a very useful state of affairs.
Science is inherently a negative discipline. You can never prove any law using the scientific method.
Even if an explanation (i.e. a natural law) is true, it is never possible to prove scientifically that the explanation is true. All you can do is collect evidence for it. Every experiment which fails to disprove the explanation is evidence for it.
On the other hand, if the explanation is not true, even one negative experiment is sufficient to disprove it.
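The asymmetry between supporting and refuting evidence can be sketched in a few lines. This is only an illustration: the function name `evaluate_explanation` and the swan example are my own assumptions, not part of any standard formulation.

```python
def evaluate_explanation(predict, observations):
    """Check an explanation against (situation, observed outcome) pairs.

    A single failed prediction is enough to reject the explanation.
    Any number of successful predictions only leaves it standing.
    """
    for situation, observed in observations:
        if predict(situation) != observed:
            return "falsified"
    return "not yet falsified"


# Hypothetical explanation: "every swan is white".
def all_swans_white(_swan):
    return "white"


supporting = [("swan 1", "white"), ("swan 2", "white")]
with_counterexample = supporting + [("swan 3", "black")]

print(evaluate_explanation(all_swans_white, supporting))
print(evaluate_explanation(all_swans_white, with_counterexample))
```

Note that the function has no return value meaning "proved": the best an explanation can do is survive every test so far.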
Science and Truth
Can science unravel the existence of the programmer?
The answer is yes — only if the programmer designed the universe that way.
On the other hand, if the programmer coded the simulation so as not to reveal his existence to us, there’s nothing we can do to discover him.
In the second situation, the truth is there’s a programmer. But there’s no scientific way to discover that. In this situation, science cannot decipher the truth. In fact, there is no way to show that science leads to the truth.
In other words, science can be agnostic to the truth.
But in fact, it gets worse. It is quite possible that the programmer coded the simulation to be deliberately deceptive. Let’s say that whenever anyone applies the scientific method to some particular question, there’s a subroutine that determines this and feeds the scientist false results designed to make him think something that is not true.
In this case, the scientific method would always yield the wrong answer to that question.
“The Fundamental Axiom of Science”
The scientific method is the only logically sound method available to us. It is the best we can do. However, our best may not be good enough. Our best method may not be enough to decipher the truth.
This is a sort of fundamental axiom of science: that the scientific method never leads to untruth. More specifically the axiom is:
If the scientific method shows that some explanation is false, then the explanation is indeed false.
This is an assumption. We can’t test it, because the programmer may not have designed the universe so that it is testable. If the assumption holds, the scientific method is valid. But not otherwise.
This is why I think that, while science will never accept god, science is actually primarily unconcerned with god. There need not be a contradiction between religion and science — until religious types start claiming they are scientific.
The issue of the legality of polygyny has some political overtones in India, with parties like the BJP calling for a uniform civil code and other parties like the Congress saying that Muslims should be allowed to marry multiple women. I had a pretty interesting discussion recently with a friend.
The idealized question was: what’s fundamentally wrong with polygyny? Can we give a rational (not religious) reason why it should be outlawed? If both the men and the women are willing, why should the government (or indeed any religion) impose a ban on it? Here’s the answer I came up with.
To try to give a reason, we fixed on some assumptions. We have to clarify what we are trying to achieve and also what kind of society we are referring to. We assume that:
- men and women have complete freedom to accept or refuse polygynous or any other marriage arrangements (though this is patently untrue in many societies)
- polyandry is not permitted
- in the population, the number of women is not greater than the number of men
- we measure individual satisfaction solely by the ability to find at least one partner (a person is satisfied if (s)he has a partner and unsatisfied if (s)he doesn’t have a partner)
- You could say that this is unrealistic because a man will be more satisfied if he has more wives. I think the outcome of my arguments isn’t affected if, instead of a zero-one satisfaction based on having a single partner, we have a law of diminishing returns based on the number of partners. That is, each additional wife adds less satisfaction than the previous one. But for simplicity, I will argue based on the zero-one satisfaction function
- This is also unrealistic because satisfaction might be based on religious reasons rather than the ability to find a partner. Simply declaring polygyny legal might give satisfaction to the entire population (or a large section of it) for religious reasons. We assume that this type of satisfaction can be ignored, though it may be important in reality.
- the wealth of a man is the most important factor in determining how many wives he has
- the goal of any legislation is to increase the average satisfaction for the people
Under these assumptions, if polygyny is allowed, what will happen is that the wealthiest men will have a larger number of wives and the poorest men will be unable to find wives. At a macro level, we might guess that the wealthiest 10% of the men might marry 40-50% of the women. The poorest 10% of the men might only be able to marry maybe 1% of the women. Since women do not outnumber men, all women will find a husband.
Since there are many more poor men than rich men, this means that there are many men without partners. This leads to a low level of satisfaction in the population.
On the other hand, if polygyny is banned, there is a much larger pool of women available to the poorest 10% of the men, and a correspondingly larger number of men with wives. So the level of satisfaction is much higher.
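To make the comparison concrete, here is a toy calculation under the zero-one satisfaction assumption. The function name, the population of 100 men and 100 women, and the particular split (the wealthiest 10 men taking 5 wives each) are all illustrative assumptions of mine, chosen only to roughly match the guesses above.

```python
def average_satisfaction(num_men, num_women, wives_per_man):
    """Fraction of the population with at least one partner.

    wives_per_man lists, wealthiest man first, how many wives each man
    would take if enough women were still available (hypothetical input).
    """
    remaining_women = num_women
    married_men = 0
    for wanted in wives_per_man[:num_men]:
        if remaining_women == 0:
            break  # every woman is married; poorer men find no partner
        taken = min(wanted, remaining_women)
        if taken > 0:
            married_men += 1
            remaining_women -= taken
    married_women = num_women - remaining_women
    return (married_men + married_women) / (num_men + num_women)


# Polygyny allowed: the richest 10 men marry 5 women each, the rest at most 1.
polygyny = average_satisfaction(100, 100, [5] * 10 + [1] * 90)

# Polygyny banned: every man marries at most one woman.
monogamy = average_satisfaction(100, 100, [1] * 100)

print(polygyny, monogamy)
```

With these made-up numbers, 40 of the 100 men are left without a partner when polygyny is allowed, so average satisfaction is 0.8 against 1.0 under the ban. The exact figures depend entirely on the assumed split; only the direction of the effect matters for the argument.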
This was a common question we asked ourselves during high school, when history seemed to have no uses and entailed an interminable sequence of facts that had to be memorized. (History’s cause wasn’t helped by the boring way in which it was taught.)
Once you grow up, it’s kind of obvious that history is important. It gives you a perspective on current events and helps assess the effects of policies and actions; observing successful people builds character and keeps the nation honest. I just wonder why no one told us this in high school!
In a previous post, I said that determinism and causality are incompatible. The argument was that in a universe where everything is determined at every time point, it makes no sense to speak of alternatives or counterfactuals of the sort: If A had happened instead of B, then C would have happened instead of D…. A could never happen, so we are predicating on something impossible: logically, “if A happens instead of B” is like “if 1 = 2”; if that happens, every statement is vacuously true.
I’ve since rethought my ideas about this. The issue is the exact definition of determinism. I used an unusual definition, so I got the unusual result that causality and determinism are incompatible. Here’s my attempt to structure my current understanding of determinism.
It appears that a definition of determinism is contingent on how we describe the evolution of the universe over time.
Let’s assume that at each time point, a system can be described by a state which belongs to some prespecified set of states. A system here could mean something like the universe, and a state could be a position and momentum for every particle in the universe (assuming the universe only contains particles). The set of states is the collection of all possible configurations of particles in the universe. In a simple universe containing only two particles that move only in one dimension, a state would look like ((p1, m1), (p2, m2)), where p1, m1 are the position and momentum of the first particle, respectively, and p2, m2 are the position and momentum of the second particle, respectively. p1, m1, p2, m2 are all single real numbers here. (If we were in 3-d space, p1, m1, p2, m2 would be vectors like (x, y, z).)
Definition of Evolution of the System in Time and Definition of Determinism. When we speak of a description of the evolution of the system over time, we are not talking about what actually happens in the system (unless, of course, the system is deterministic in which case what actually happens is the same as what might happen). Instead, we are talking about potential occurrences i.e. predictions about the future. We might say that if the system is in state X at time 1, it could be in either state Y or state Z at time 2. This does not mean the system will be in both states Y and Z simultaneously at time 2. It means that the system will be in one of those two states at time 2; we don’t know which one. The description is from the viewpoint of an extra-system observer, unaffected by the system’s timeline, who knows what might happen, but not what actually will happen (unless the universe is deterministic).
To describe the system’s evolution, we have to provide an evolutionary tree of some sort. It might be of the following form:
That is, a tree depicting potential states at each time point. Providing a particular evolutionary tree with X1 being the actual (observed) state at the beginning of the universe is one option. Providing “transition functions” that specify what the potential states at the following time point are given any state at a particular time point is another way.
- [Invalid Definition] (E1) Suppose a description of the evolution of the system over time consists of a single evolutionary tree. (D1) A system is deterministic if the tree has no forks; otherwise it is non-deterministic.
- [Valid Definition] (E2) An alternative description of the evolution of the system over time consists of a collection of transition functions (“the laws of physics”). Given any state at a time, the transition functions can be applied to calculate the possible states of the system at any point in the future (i.e., calculate an evolutionary tree starting at that time point). (D2) Here we would say the system is deterministic if there is exactly one state possible at every point in the future (i.e., the calculated tree has no forks).
These two definitions look almost identical. However, the first definition only specifies one possible tree rooted at X1. The second definition lets us calculate the potential states at time 2 once we know the actual state at time 1, using the transition functions. In other words, we can substitute another state, say X8, at time 1 and still compute what possibilities that universe would have.
Even in the deterministic case, they are different for the above reason. The second definition is constructive and so tells us what will happen in the case of interventions. That is, if an external (from outside the system) agent sets the state of the system to some value at some time point, the second definition allows us to calculate the new states in the future of that time point. The first definition doesn’t. For example, in the figure, if the state at time 2 is X2, we know that the possibilities for time 3 are X4 and X5. But what if the state at time 2 is X6? Under (E1) we have no way of knowing; under (E2) we can calculate the possibilities.
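A small sketch may help show what (E2) buys us. The state names follow the labels used above, and the fork from X2 to X4 and X5 is taken from the example; every other arrow in the table is invented purely for illustration, as are the function names.

```python
# Hypothetical transition functions (E2): each state maps to the set of
# states that are possible at the next time point. Only X2 -> {X4, X5}
# comes from the example in the text; the other entries are made up.
TRANSITIONS = {
    "X1": {"X2", "X3"},
    "X2": {"X4", "X5"},
    "X3": {"X3"},
    "X4": {"X4"},
    "X5": {"X5"},
    "X6": {"X7"},
    "X7": {"X7"},
    "X8": {"X6"},
}


def possible_states(start, steps, transitions=TRANSITIONS):
    """States the system could be in `steps` time points after `start`."""
    current = {start}
    for _ in range(steps):
        current = set().union(*(transitions[state] for state in current))
    return current


def is_deterministic(transitions):
    """(D2): deterministic if every state has exactly one possible successor."""
    return all(len(successors) == 1 for successors in transitions.values())


# The tree rooted at X1, which is all that (E1) would give us.
print(possible_states("X1", 2))

# An intervention: an outside agent sets the state to X6. (E1) is silent
# here, but with transition functions we can still compute what follows.
print(possible_states("X6", 1))

print(is_deterministic(TRANSITIONS))
```

The point of the sketch is that `possible_states` accepts any starting state, including ones that never appear in the tree rooted at X1. That is exactly the extra information (E2) carries over (E1).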
I had a very interesting discussion yesterday about whether the concept of the state (i.e., country) is now obsolete. The basic premise is that the world is flat, and that national boundaries are irrelevant in the current global economy. The arguments were roughly along the following lines:
- Corporations: Corporations act in ways that benefit people of all countries. The basic unit of society should be the corporation, not the nation. An American company that lays off people in America frees them up to do better, more imaginative, more creative, more cerebral work. The same company, which hires replacements in India, improves the lives of those Indians, who would otherwise have been unable to find work that paid them so well.
- Brain Drain: The argument was taken further: brain-drain is not really a drain at all, because national boundaries don’t matter. Thus top brains and talent moving from India to the US is not a concern. It is better to use your brains in the US than to underuse them in India. And India benefits from this: foreign remittances to India are higher than to any other country in the world.
- America: There is only one country in the world, the USA, which has an inherent culture of innovation and discovery. (Or perhaps two or three others at most, Germany being a possibility.) This is why no innovation happens in India, and cannot happen in India, because the people, by nature, lack innovativeness.
- India: India, more than any other place, doesn’t deserve nationhood because of the diversity of its people. An Indian feels like a stranger in a different part of his own country. The US feels more like home than India.
I didn’t agree with these points. My answer yesterday to the question: “What is the point of nations?” was “Bargaining power”. Here’s a Q & A:
Q01: What is the point of nations?
Ans: Bargaining power. A nation is nothing more than a collective that bargains in order to increase the standard of living (SoL) for its citizens. It is the same concept as that of a workers’ union.
Q02: What is the point of nationalism?
Ans: The reason a citizen should support his nation (and the concept of nationhood) is that it increases his chances of a better SoL. Nationalism increases a nation’s ability to bargain, by increasing the nation’s unity.
Q03: Then why shouldn’t everyone in the world pledge their loyalty to those nations that have the highest chances of improving their citizens’ SoL? Specifically, the USA?
Ans: If an individual’s goal is to increase his SoL, he should indeed attempt to become a citizen of the country most likely to increase its citizens’ SoL. The reason this doesn’t happen in practice is countries like the USA realize it is not in their best interest, and have laws in place to prevent easy access to citizenship.
Q04: Which laws?
Ans: To become a citizen, one has to demonstrate both competence (through employability) and American nationalism (through a test and residence). America realizes that notions of the world being flat (in the sense of nonexistent national boundaries) are not in its best interests.
Q05: Why is “no boundaries” not in America’s best interest?
Ans: For Americans to remain prosperous, there needs to be a vastly larger population of non-Americans. There needs to be someone to bargain with, someone to exploit.
Q06: Huh?? Why? What do you mean by “exploit”?
Ans: American power has many immediate reasons, but it can be traced back to a form of imperialism. America’s prosperity relies on the exploitation of non-Americans, just as the prosperity of every other major power throughout history relied on exploitation of other populations. Unless a vast population of non-Americans exists, it will be impossible to use America’s bargaining power to acquire various raw materials from them at prices much lower than the cost it takes to extract them. This is not a bad thing; it is what every major country in the world is trying to do, and is what every trader in a market attempts to do on a daily basis. It’s just that America is better at it than other nations.
Q07: Rot! Trade is better for all parties involved.
Ans: Not if one party is in a much stronger position than another. Large nations work very hard to prevent a truly level playing field. It’s very hard for a small, poor country to walk away from a tough deal.
Q08: Even if nations are relevant, why don’t we stop at Indian states? Why shouldn’t Rajasthan, West Bengal and Tamil Nadu be separate countries? Why do we need the whole of India to be one country?
Ans: Because larger countries have more bargaining power than smaller ones. It’s possible to get too large — when there are not enough foreigners to exploit and internal management becomes hard. But until we reach that point, it is best to grow larger.
Q09: Then why shouldn’t India annex more land and become an even bigger country?
Ans: If we can, we should. China knows this; that’s why China seized Tibet. But we need to make sure the negative consequences of such an action don’t outweigh the gains.
Q10: Well, the USA can certainly annex more land. Why doesn’t it do so?
Ans: The fallout from such an action would have an unjustifiable cost for the USA. It is so stable and has such a high SoL that managing a population of unwilling conquerees would lower the overall American SoL. Even integrating willing conquerees into American society would be very costly. Increasing the American SoL at this point is much more easily accomplished by projection of soft power.
Van Gogh’s paintings Starry Night and Cafe Terrace at Night stir something deep. My interpretation (which van Gogh probably never intended) is that they are a contrast between the warm, familiar fold of civilization and the wild unknown mystery of the celestial sky. In Starry Night, it is as if the monumental forces lying in the hearts of suns and galaxies have descended onto the cozy hamlet of Saint-Rémy, which is getting ready to tuck in for the night, unaware and unconcerned about the fantastic forces at work in deep space.
The same sentiment is stirred by Cafe Terrace at Night: the warmth of familiar surroundings and human company contrasted to the unknowns in the surrounding dark streets, and even more, the unknowns up in the sky. I can’t decide what I want to be: a diner at the cafe or a predator lurking in the dark alleys, looking at the diners and waiting for one of them to leave that safe haven.
What is religion, why do we need to have faith, why do we need gods?
Life includes a series of decisions. Decisions help us optimize our condition, find a route to another condition that is better, more stable, easier or happier. But the number of minute decisions that need to be made is so large that our built-in computer, the brain, is overwhelmed by the computational requirements.
So it takes shortcuts. It categorizes the decisions, pushing some, such as picking up the next spoonful of food or stepping aside to avoid a pothole, into a subconscious decision making queue. Others are not so subconscious but are still routine jobs, like signing your name on a credit card bill or going to work in the morning. Even with these reductions on its computational requirements, the brain would be left with too many significant mid- and long-term decisions.
Religion is the knowledge applicable to another subcategory of these remaining decisions. In many cases, it quickly allows us to use the past experience of wise people to determine a course of action when faced with certain decisions. Trying to figure every one of these out for oneself would put too much of a computational burden on the brain. Religion gives quick answers, without always requiring us to think hard.
Of course there are still a lot of decisions that can’t be addressed by religious knowledge, and which might require individual thinking. But religion helps quite a bit; a lot of right-and-wrong type decisions can be solved quickly by referring to religious knowledge.
Vaguely, this is what the title means: Suppose John is a bad influence on Bob, and Bob robs Dave. Should we say that John is responsible or Bob is? I think it is possible to say that both are.
I’m sure legal systems have thought about this sort of thing a lot…
China seems to lose all sense of proportion and balance when it comes to the Dalai Lama, who was recently awarded the Congressional Gold Medal by President Bush.
China has “summoned” the US ambassador to convey that ties had been “gravely eroded”. Their spokesperson claims that “The move of the United States is a blatant interference in China’s internal affairs, hurts the feelings of the Chinese people and has gravely undermined relations between China and the United States.”
According to this article:
Liu said Chinese Foreign Minister Yang Jiechi summoned US Ambassador to Beijing, Clark Randit and lodged a “solemn protest” for disregarding repeated Chinese requests not to honour the Dalai and prevent senior US leaders from meeting him.
The Chinese government is probably the only entity that fails to realize how ridiculous these claims are. By asking the US to prevent its leaders from meeting the Dalai Lama, it is China that is interfering in the affairs of the US. Before setting about summoning US ambassadors, China would do well to remember that (unlike Russia) it owes 100% of its current prosperity and technology to the US. And the Chinese melodrama about their commie government choosing the next Dalai Lama is the most ridiculous farce ever. By what right do the Chinese commies choose the Dalai Lama, the leader of a religion that stands for the opposite of everything the commies do?
It is well known that Australian cricket uses all the resources at its disposal to advance whatever causes it has. In addition to using the best training available, Australians also use sledging to win matches. Australians also tend to be “forgiven” more easily for on-field confrontations than cricketers from the subcontinent, and are good at being the first to level pre-emptive, or first-strike, accusations of a variety of sorts at everybody. Such allegations include accusations of cheating. Almost all their accusations have proved unfounded.
In the last year, Australian cricket has started its most ridiculous accusation fad yet. Increasingly, allegations of Indian racism have begun emerging out of Australia. Darrell Hair was the first to do this. In the fifth India-Australia one-day international in India this year, the Australians began accusations of “racial abuse” by Indian spectators. Both allegations are ridiculous, and the Australian cricket board knows it. That is why neither allegation was ever acted upon by them… they know such allegations wouldn’t survive any sort of scrutiny.
However, the accusations do serve to muddy the waters and set precedents for accusations of Indian racism. After several years of such accusations, they will become sufficiently well-entrenched to be taken seriously.
The only remedy for such accusations is for cricket bodies to investigate them and expose them for the frivolous sensationalism they are. This would diminish the credibility of the Australian cricket board, forcing them to think twice before throwing such accusations around.
Cricket statistics indicating performance for batsmen commonly include the number of innings played, batting average, number of ducks, and scoring rate. One crucially important statistic that is missing from this list is any measure of consistency.
Inconsistent batsmen who tend to score big occasionally can have inflated averages that belie their true worth to the team. Consistency is much more valuable than the number of centuries scored. Thus it is important to include a measure of consistency.
What measures might be good? Variance, absolute deviation and entropy are all decent absolute measures of consistency. At least one of these should be reported along with the other statistics for every batsman.
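As a rough sketch of how these three measures could be computed, here is a version using only the Python standard library. The function name, the 25-run bin width used for the entropy, and the two made-up score lists are my own assumptions; for simplicity the average here is the plain mean of innings scores and ignores not-outs.

```python
import math
from collections import Counter
from statistics import mean, pvariance


def consistency_measures(scores, bin_width=25):
    """Spread of a batsman's innings scores; lower means more consistent."""
    average = mean(scores)
    # Mean absolute deviation from the average score.
    absolute_deviation = mean([abs(score - average) for score in scores])
    # Entropy (in bits) of the scores grouped into bins of `bin_width` runs.
    bins = Counter(score // bin_width for score in scores)
    total = len(scores)
    entropy = -sum((n / total) * math.log2(n / total) for n in bins.values())
    return {
        "average": average,
        "variance": pvariance(scores),
        "absolute_deviation": absolute_deviation,
        "entropy": entropy,
    }


# Two hypothetical batsmen with the same average of 50.
steady = [40, 45, 50, 55, 60]
erratic = [0, 5, 10, 35, 200]

print(consistency_measures(steady))
print(consistency_measures(erratic))
```

Both batsmen average 50, but the erratic one has a far larger variance and absolute deviation, which is the information the usual statistics hide. The entropy figure depends on the chosen bin width, so it is best compared only between batsmen using the same bins.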
Many attempts to define determinism, the philosophical notion that everything that happens in the universe is pre-ordained or pre-decided, involve the notion of causality. A causal chain or graph of events, driven by the laws of physics, is supposed to explain how determinism can be. In this view, the state of the universe at any time is determined once we have an initial state of the universe and a set of physical laws which allow us to compute the state of the universe at any time. Of course, since we are also part of the universe and are thus subject to its laws, some thinkers construct these arguments from the viewpoint of a hypothetical “demon” residing outside the universe and unaffected by the universe’s laws.
In what follows, I will argue that the most common notion of causality, based on counterfactual outcomes, is meaningless in a deterministic universe. We may have to adopt a definition of causality which relies on computability within the universe: A causes B if we can start with state A and compute a sequence of state changes induced by the laws of the universe, ending in B.
Counterfactual Causality Fails in a Deterministic Universe
According to the Wikipedia entry on determinism:
Causal (or nomological) determinism is the thesis that future events are necessitated by past and present events combined with the laws of nature.
The Wikipedia entry on Causality has this to say:
The philosopher David Lewis notably suggested that all statements about causality can be understood as counterfactual statements. So, for instance, the statement that John’s smoking caused his premature death is equivalent to saying that had John not smoked he would not have prematurely died.
The incompatibility between determinism and causality is now easy to see: if causality is defined counterfactually, then any event A which occurs before an event B is causally responsible for B. This is because the statement “If A had not occurred, then B would not have occurred” is vacuously true in a deterministic universe. “If A had not occurred” is like saying “If 1 equals 2”, because determinism says that A occurring is the only possibility. Thus, if A occurs before B, then A is causally responsible for B.
Causality as Computation
Perhaps a modified definition of causality will help take care of this problem. Suppose that, by “A causes B”, we mean that a computer within the universe is able to find a chain of applications of the laws of the universe which takes the universe from state A to state B (via some sequence of intermediate events). Then we can say that A causes B. Note that this definition refers to the ability to compute or the ability to understand.
The definition is not yet valid, however. What if, given any two events A and B, we can compute such a sequence of intermediate events? Then this definition would be no more useful than the previous one based on counterfactuals. We may have to abandon an attempt to define causality as either true or false (A causes B or A does not cause B) and accept a definition based on degrees of causality. Thus, if the chain of intermediate events going from A to B is long, we say the relationship is “less causal”, and if it is short, we say it is “more causal”.
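One way to picture this graded notion is a search for the shortest chain of law applications between two states. The sketch below is only an assumption about how such a measure might look: the toy `laws` table, the function names and the scoring formula 1 / (1 + chain length) are all hypothetical choices of mine.

```python
from collections import deque


def causal_chain_length(laws, a, b):
    """Length of the shortest chain of law applications from a to b.

    `laws` maps each state to the states that can directly follow it.
    Returns None if no chain from a to b can be computed.
    """
    queue = deque([(a, 0)])
    seen = {a}
    while queue:
        state, steps = queue.popleft()
        if state == b:
            return steps
        for following in laws.get(state, ()):
            if following not in seen:
                seen.add(following)
                queue.append((following, steps + 1))
    return None


def degree_of_causality(laws, a, b):
    """Hypothetical score: shorter chains count as 'more causal'."""
    length = causal_chain_length(laws, a, b)
    if length is None:
        return 0.0
    return 1.0 / (1 + length)


# A toy universe with made-up laws.
laws = {"spark": ["fire"], "fire": ["smoke"], "smoke": ["alarm"]}

print(degree_of_causality(laws, "spark", "fire"))
print(degree_of_causality(laws, "spark", "alarm"))
print(degree_of_causality(laws, "alarm", "spark"))
```

In this toy universe the spark is "more causal" for the fire (one step) than for the alarm (three steps), and the alarm is not causal for the spark at all. Any decreasing function of chain length would serve equally well as the score.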
What is Ethics? What is the foundation for ethics? Do we need religion for ethics? Can a mechanical (soulless, purely physics-driven) being have ethics? How can ethics be derived in a deterministic universe without free will? The Optimization viewpoint. Can there be an Ultimate Logical Justification for any system of ethics?
Ethics Without Soul
There has recently been a lot of controversy about Atheist ethics. Ethical systems have, traditionally, been tied to religion. Since religions became widespread, the primary motivation for ethical behaviour has been religious. Each religion has its own ethical system. Almost all religions specify carrot-and-stick reasons for behaving ethically. In the Abrahamic religions, heaven and hell are the carrot and the stick. In Hinduism, nirvana and demotion in the “highness” of being are the carrot and the stick. Not all religions insist on the existence of one universal “God”, but Atheists often remain unattached to any of the usual religions in addition to a lack of belief in a “God”. The question then arises: Can Atheists behave ethically?
More generally, the question can be posed for any mechanistic system (a system ruled only by the laws of physics and not by any agent, such as a “soul”, connected to religion). Mechanistic systems include humans, other organisms, robots, and any other objects or phenomena. (Whether humans are mechanistic is a subject of much debate; see Strong and Weak Artificial Intelligence and Gödel, Penrose and Artificial Intelligence — Simplified.) What does ethics mean for a mechanistic system?
The Goal of Ethics
I think ethics can be viewed as a mechanism for preservation or proliferation of complexity. Complexity is precious; the entropy grindstone is constantly trying to destroy it (the second law of thermodynamics). Every ethical principle we have can be seen as ultimately serving complexity. Here are some examples.
We prize human life over that of all other animals. This is consistent with complexity preservation: humans are more complex than other animals. We think killing an animal for no reason is unethical; we feel no such thing about smashing a rock. This is also consistent with complexity preservation: an animal is more complex than a rock.
A lot of things are not directly connected to complexity preservation, but come about because we need simple rules of thumb that we can follow easily. Lying is considered unethical. In the long term, this helps preserve social order and thus helps preserve the human species.
Thus mechanistic systems can have ethical behaviour – behaviour which eventually tends to preserve or increase complexity. Atheists can be as ethical as anyone else, as can a robot, as long as their actions are directed towards optimizing complexity.
Thus we have converted the problem of constructing ethical systems to an optimization problem. The objective function (which we are trying to maximize) is overall complexity. Ethics can now be viewed as rules of behaviour which, when followed, tend to increase complexity.
Our Ethical Principles
So this tells us what ethics is about, and what ethics aims to do. But it still doesn’t tell us how a mechanistic individual should develop his/her sense of ethics. A person can hardly be expected to think of some far-off big-picture complexity goal when deciding what constitutes good ethics. How can the above definition be made practical?
First, by recognizing what the eventual goal of ethics is, we have converted the construction of ethical principles into an optimization problem. This is a good first step, since we now know what it is we are trying to do when we talk about acting ethically.
Our solution to the optimization problem does not always rely on the objective function of complexity, but rather relies on the observation that various human institutions (societies, religions, legal systems) have already come up with rules of thumb for this optimization. Once we recognize this, we use our judgment to decide which of the existing rules are relevant to overall preservation of complexity and adopt an ethical system based on these rules. This solution may not be perfect, but it is more important that the ethical rules be easy to remember and follow – what use is a perfect but unintelligible and impractical rule? It is preferable, I think, to find simple and general rules, and avoid special cases and exceptions as much as possible.
What’s more, once we recognize this as a valid scheme for the generation of ethical principles, we can free ourselves from the past. Faced with a new situation, we can find ethical rules tailored to the new situation, rather than trying to search for rules buried in existing religious systems that are applicable. A religious system may be able to help, but the effort of trying to reconcile religion with the new situation is often not worth it.
Time and space are modes in which we think and not conditions in which we live — Einstein
We have a specific way of perceiving things. For example, our mind perceives the world through a four-dimensional model: 3 spatial dimensions, and one (unidirectional) time dimension. But is this the only way the world around us can be perceived?
It is clear that, as long as there is a one-to-one mapping between one representation and another, any two representations of any piece of information are equivalent. For example, it does not matter whether we store a position in polar or Cartesian coordinates – because we have a one-to-one map from one to the other.
So, imagine that we meet an alien species. Would they necessarily have a unit of distance? Could it be that, instead of (x, y, z, t), they perceive (tx, ty, tz, t^3)? Their unit of measurement would then have distance and time entangled together. They might say, “walk for 125 cube-seconds” (equivalent to us saying “walk for 5 seconds”). Our statement “the car is 10 kilometres away and the time now is 5 seconds” would translate to “the car is 50 km-seconds away and the time now is 125 cube-seconds”. Is there a logical reason why every species should perceive in the same units that we do? Maybe not!
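The translation between the two ways of perceiving can be written down directly. This is a minimal sketch with function names of my own choosing; it also shows one caveat, which is that the mapping can only be undone when t is not zero, since at t = 0 every position collapses to (0, 0, 0, 0).

```python
import math


def to_entangled(x, y, z, t):
    """Our (x, y, z, t) expressed as the hypothetical aliens' (tx, ty, tz, t^3)."""
    return (t * x, t * y, t * z, t ** 3)


def from_entangled(tx, ty, tz, t_cubed):
    """Recover (x, y, z, t) from the entangled representation."""
    # Real cube root that also works for negative values.
    t = math.copysign(abs(t_cubed) ** (1.0 / 3.0), t_cubed)
    if t == 0:
        raise ValueError("the mapping cannot be inverted at t = 0")
    return (tx / t, ty / t, tz / t, t)


# A car 10 km away along x, observed at t = 5 seconds.
alien_view = to_entangled(10, 0, 0, 5)
print(alien_view)

# Translating back gives our description again, up to floating-point rounding.
print(from_entangled(*alien_view))
```

Here 10 kilometres at 5 seconds becomes 50 km-seconds at 125 cube-seconds, and the reverse translation recovers the original values up to rounding in the cube root.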
This needn’t be restricted just to distance and time. A species might perceive taste and colour together, or even distance confounded with emotional state. “That’s red-sweet, my friend, but it’s happy-far!”