Fair Representation and a Linear Shapley Rule

When delegations to an assembly or council represent differently sized constituencies, they are often allocated voting weights that increase with population (EU Council, US Electoral College, etc.). The Penrose square root rule (PSRR) is the main benchmark for fair representation of all bottom-tier voters in the top-tier decision-making body, but it rests on the restrictive assumption of independent binary decisions. We instead consider intervals of alternatives with single-peaked preferences, and presume positive correlation of local voters' preferences. This calls for replacing the PSRR by a linear Shapley rule: representation is fair if the Shapley value of the delegates is proportional to their constituency sizes.


Introduction
Shapley and Shubik (1954) advertised the Shapley value as "A method for evaluating the distribution of power in a committee system" almost immediately after the value's introduction by Lloyd S. Shapley (1953). Their motivation included not only the problem of measuring a priori voting power in a given weighted voting system or in multicameral legislatures such as the US Congress; they also explicitly referred to the design of decision-making bodies and asked: "Can a consistent criterion for 'fair representation' be found?" (p. 787). This question was later taken up, and tentatively answered, by Riker and Shapley (1968).
Numerous studies in political science, economics, and business have since invoked the Shapley-Shubik index (SSI), which is simply the specialisation of the Shapley value to the class of monotonic simple games (N, v), where v : 2^N → {0, 1} categorizes coalitions S ⊆ N as winning or losing according to a given decision rule. It has been used to examine the division of power in committees, shareholder meetings, councils, and assemblies, or to assess the power shifts caused by EU enlargements, changes of treaties, etc. See Felsenthal and Machover (1998), Laruelle and Valenciano (2008), or Holler and Nurmi (2013) for overviews. This wide application in power analysis notwithstanding, the Shapley value's role as a tool for designing political institutions is probably outshone by a fairness benchmark which relates to the Banzhaf value. The latter was introduced to the game theory community by Dubey and Shapley (1979), who provided a comprehensive mathematical analysis of a voting power index proposed by the lawyer John F. Banzhaf (1965). Banzhaf's interest in voting power was sparked by the US Supreme Court's series of 'one person, one vote' rulings in the 1960s. His index was popularized in later legal cases. Unbeknownst to Shapley, Shubik, Riker, and Banzhaf, an equivalent power measure had already been investigated, and the question of fair representation partly settled, by the statistician Lionel S. Penrose (1946, 1952).[1] With the newly established United Nations in mind, Penrose studied two-tier systems in which constituencies (member countries, states, etc.) of different sizes elect one delegate each to a decision-making assembly. He explained how proportional weighting would give voters in larger constituencies disproportionate power.
Rather, the problem of giving equal representation to all constituents is solved by choosing top-tier voting weights such that the resulting pivot probabilities (i.e., Banzhaf values) of the delegates are proportional to the square root of the represented population sizes. This result, rederived by Banzhaf (1965) and sketched by Riker and Shapley (1968), is now known as the Penrose square root rule (PSRR). It and the corresponding Penrose-Banzhaf index became the key reference for many applied studies on federal unions and two-tier voting systems such as the US Electoral College (e.g., Grofman and Feld 2005; Miller 2009, 2012), the Council of the EU (e.g., Felsenthal and Machover 2004; Fidrmuc et al. 2009), or the IMF (e.g., Leech and Leech 2009).

[1] The measure was again independently proposed by social scientists Rae (1969) and Coleman (1971). See Felsenthal and Machover (2005) for the curious history of "mis-reinvention" in the analysis of voting power.
The PSRR follows straightforwardly from assuming that citizens at the bottom tier vote independently of each other, with equal probabilities for a 'yes' and a 'no'. The objective is that each voter shall have the same probability of casting a decisive vote, i.e., of swinging the local decision and thereby the global one.[2] It is often forgotten, however, that the rule lacks justification if voters' decisions are statistically dependent (cf. Chamberlain and Rothschild 1981 or Gelman et al. 2002) or for non-binary decisions.
We here propose to replace Penrose's binary model of two-tier voting by a continuous median voter environment, also analyzed by Kurz et al. (2016). The model gives a simple explicit micro-foundation for using the SSI rather than the Banzhaf value in two-tier voting analysis. Moreover, one arrives at a linear Shapley rule instead of Penrose's square root one.
Our voters are assumed to have single-peaked preferences over an interval of alternatives, not merely 'yes' and 'no'. Their delegates represent the median preference of the constituency in the considered assembly. This assembly applies a weighted majority rule. One hence obtains policy outcomes which equal the ideal point of the median voter of the constituency whose delegate is the assembly's weighted median. If there are many voters and the ideal points of their preferences have a continuous distribution and positive correlation within each constituency (while being independent across constituencies, as in Penrose's model), then the probability of a given delegate being decisive in the assembly asymptotically approaches the Shapley value of the delegate, not the Banzhaf value. Because the influence of a given voter on the position adopted by his or her delegate is inversely proportional to the constituency population, not to its square root as for binary options, this implies: if voters shall have the same indirect influence on outcomes, the weighted majority game among their delegates must have a Shapley value proportional to the represented population sizes.

[2] For odd population size n_i = 2k + 1, decisiveness of voter l inside constituency C_i coincides with an even split between the 2k other voters. The probability of this event, with individual 'yes' and 'no' decisions being equally likely and independent, is 2^(-2k) · (2k choose k). By Stirling's formula, this is approximately √(2/(π · n_i)). Decisiveness of constituency C_i at the top tier is captured by its Penrose-Banzhaf index in this setup, which must hence be rendered proportional to √n_i by choosing appropriate voting weights.
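The approximation in the footnote is easy to check numerically. The following small Python sketch (function names are ours) compares the exact even-split probability 2^(-2k) · (2k choose k) with the asymptotic value √(2/(π · n)):

```python
from math import comb, pi, sqrt

def pivot_probability(n):
    """Exact probability that a given voter is decisive in a constituency
    of odd size n = 2k + 1: the other 2k voters split evenly, each voting
    'yes' or 'no' independently with probability 1/2."""
    k = (n - 1) // 2
    return comb(2 * k, k) / 4 ** k

def stirling_approximation(n):
    """Asymptotic value sqrt(2 / (pi * n)) from Stirling's formula."""
    return sqrt(2 / (pi * n))

for n in (11, 101, 1001):
    exact, approx = pivot_probability(n), stirling_approximation(n)
    print(n, round(exact, 5), round(approx, 5))
```

Already for populations in the hundreds, the two values agree to three or four decimal places.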
This linear Shapley rule does not require strong correlation of individual opinions provided constituencies are as sizable as in most real applications. The assumption of some preference affiliation within the constituencies suffices. It seems at least as natural as that of statistical independence also from a constitutional a priori perspective. In particular, if all voters were perfectly exchangeable then there should be no objection to redrawing the constituency boundaries. One could then design constituencies to be approximately equal in terms of population size and the question of which voting weights to use would become redundant.
The identified linear Shapley rule does not generally imply that voting weights be proportional to population sizes. This holds only in the limit as the number of constituencies increases (Neyman 1982; Lindner and Machover 2004). The analysis hence strictly refines the simple intuition that 'one person, one vote' calls for the weights themselves to be proportional to populations. One should rather identify voting weights such that the resulting majority game implies a desirable Shapley-Shubik index. That is, one needs to solve the inverse problem of the SSI.
We will point to some more of the related literature on two-tier voting systems in the next section and then present our median voting model in Section 3. We formalize the fair representation problem in Section 4. The main result is derived in Section 5 and we discuss practical aspects of it in Section 6. We conclude in Section 7.

Related Literature
Most research on the design of two-tier voting systems has maintained the basic binary framework adopted by Penrose (1946, 1952), Banzhaf (1965), and Riker and Shapley (1968). One major strand of literature, including Chamberlain and Rothschild (1981), Gelman et al. (2002), Gelman et al. (2004), and Kaniovski (2008), has considered relaxations of the assumption that individual 'yes' or 'no' decisions are independent and uniformly distributed. It has turned out that Penrose's square root rule lacks robustness. In particular, strictly less concave weight allocations are necessary if decisions exhibit positive correlation within constituencies.
A second big strand has considered other objectives than fair representation. The most salient alternative is optimal representation in the sense of maximal utilitarian welfare. Barberà and Jackson (2006) and Beisbart and Bovens (2007) derived that, generally speaking, constituencies' voting weights need to be proportional to the expected utilitarian importance they attach to an issue. This means that welfare is maximized by square root weights in case of independent voter preferences but by proportional weights in case of perfect alignment within the constituencies. Koriyama et al. (2013) relatedly considered welfare with the twist that a voter's utility is not additive across multiple issues but a strictly concave function of the frequency with which the collective 'yes' or 'no' decision conforms to the individually preferred outcome. This generally calls for voting weights to be strictly concave in constituency sizes.
Other studies have considered the majoritarian objective of selecting two-tier voting weights which, in a suitable sense, bring the implied top-tier decisions as close as possible to the decisions which would have resulted in a single encompassing constituency, i.e., in a direct referendum. Clashes between the outcomes of direct and indirect democracy -instances of the so-called referendum paradox (see, e.g., Nurmi 1998) -are impossible to avoid; a prominent case was the election of President Bush by the US Electoral College against the popular majority in 2000. Felsenthal and Machover (1999) have investigated the 'mean majority deficit' of a two-tier system, referring to the difference between the size of the popular majority camp and the number of citizens in favor of the assembly's decision. Kirsch (2007) instead considered the mean quadratic deviation between the shares of 'yes'-votes at the bottom and top tiers, while Feix et al. (2008) sought to minimize the probability of the top-tier decision being at odds with the majority of citizens. All three studies identified a key role for weight assignments that relate to the square root of constituency sizes if voter opinions are independent and identically distributed (i.i.d.). However, Kirsch (2007) and Feix et al. (2008) give warning that correlated opinions at the constituency level may call for proportionality to the numbers of represented voters. This dichotomy was confirmed also in simulations by Maaser and Napel (2012) which left the binary Penrose-Banzhaf framework. Their objective was to minimize expected distance between the positions of the decisive delegate at the top tier and the electorate's median voter in case of an interval of policy alternatives.
For the same convex policy environment, which we will also study here, Maaser and Napel (2007) and Kurz et al. (2016) have turned to the original question of 'fair representation'. If the ideal points which characterize voters' single-peaked preferences are i.i.d., fair weights become proportional to the square root of population sizes as the number of constituencies increases. In view of asymptotic results by Lindner and Machover (2004), this matches Penrose's original conclusion even though the respective square root findings obtain from the superposition of very different effects. Crucially, voting weights proportional to constituency sizes quickly perform better if positive preference correlation is introduced.
Laruelle and Valenciano (2007) are a notable exception: they, too, concluded that fair representation requires the delegates' voting power to be proportional to the represented constituents, exactly as our Corollary 1 asserts below.
With this exception and that of Riker and Shapley (1968), the Shapley value or SSI has so far, to the best of our knowledge, not featured as a benchmark for fair two-tier voting systems -despite its frequent application in positive analysis of voting power.
Riker and Shapley provided no explicit mathematical analysis in their article. In the wake of the US Supreme Court's decisions in Baker v. Carr and Reynolds v. Sims, they focused on the delegate model of representation where each representative acts as a funnel for binary majority decisions in his or her constituency. They argued, but did not prove, that a square root rule based on the SSI solves the problem in this model.
Much more briefly, Riker and Shapley (1968) also discussed the Burkean trustee model of representation. In that model, representatives are 'free agents' who "seek to satisfy the general interest" (p. 211) of their constituency rather than the interests of the winning majority. Under the ad hoc assumption that such a free agent's SSI can be divided among all his constituents in equal measure, Riker and Shapley concluded for this case that a representative's SSI needs to be proportional to the number of voters in his or her constituency. Our analysis derives the same conclusion from an explicit delegate model. The key distinction to the setting of Riker and Shapley (1968) is that we consider many rather than only two policy alternatives and incorporate preference correlation at the constituency level. The general pattern in the literature (square root vs. linear weighting rules for independent vs. correlated preferences) is thus confirmed once more.

Two-Tier Median Voter Model
We assume the same median voter framework as Kurz et al. (2016), and partly draw on the presentation therein. Take a population of n voters and let C = {C_1, . . . , C_m} be a partition of it into m < n constituencies C_i with n_i = |C_i| > 0 members each. The preferences of each voter l ∈ {1, . . . , n} = ∪_i C_i are assumed to be single-peaked over a finite or infinite real interval X ⊆ R, i.e., a convex rather than binary policy space.
The respective peaks or ideal points are taken to be identically distributed and mutually independent across constituencies. However, we allow for a particular form of preference correlation within each constituency.
Specifically, the ideal point ν_l of voter l in constituency C_i is conceived of as the realization of a continuous random variable

ν_l = ε_l + t · µ_i,     (1)

where t · µ_i is a constituency-specific shock. Random variable µ_i has the same continuous distribution H for any i ∈ {1, . . . , m}, with a bounded density and finite variance σ_H². The scalar t ≥ 0 parameterizes the similarity of opinions within the constituencies. Voter-specific shocks ε_l account for individual political and economic idiosyncrasies. They are presumed to have the same continuous distribution G for all l ∈ {1, . . . , n} with finite variance σ_G². The respective density is assumed to be positive and continuous at G's median. This rules out the possibility of a gap between 'left' and 'right' opinions, which would generate a binary model through the backdoor.
A given profile (ν_1, . . . , ν_n) of ideal points could reflect voter preferences in abstract left-right spectrums or regarding specific one-dimensional variables such as the location or scale of a public good, an exemption threshold for regulation, a transfer level, etc. Variance σ_G² is a measure of heterogeneity within each constituency. Variance t²σ_H² of t · µ_i is a measure of heterogeneity across constituencies. Preferences in all constituencies vary between left-right, high tax-low tax, etc. in a similar manner, but the constituencies' ranges of opinion are typically located differently from an interim perspective. Still, all ideal points are a priori distributed identically, i.e., we adopt a constitutional 'veil of ignorance' perspective which acknowledges that ν_l and ν_k are correlated with coefficient t²σ_H²/(t²σ_H² + σ_G²) whenever l, k ∈ C_i.

On any given issue, a policy x* ∈ X is selected by an assembly of representatives which consists of one delegate from each constituency. Without going into details regarding the procedure for within-constituency preference aggregation (bargaining, electoral competition, or a central mechanism), we assume that the preferences of C_i's representative coincide with the respective median preference of the constituency. So the location of the ideal point of representative i is

λ_i = median {ν_l : l ∈ C_i}.     (2)

We admittedly put aside at least two practical problems with this assumption.
First, systematic abstention of certain social groups can drive a substantial wedge between the median voter's and the median citizen's preferences, and non-voters go unrepresented. Second, due to agency problems, a representative's position may differ significantly from his district's median.

In the assembly, constituency C_i has voting weight w_i ≥ 0. Any coalition S ⊆ {1, . . . , m} of representatives which achieves a combined weight Σ_{j∈S} w_j above q̃ ≡ q · Σ_{j=1}^m w_j for q ∈ [0.5, 1) is winning and can pass proposals to implement some policy x ∈ X. This voting rule is denoted by [q; w_1, . . . , w_m].
Now consider the random permutation of {1, . . . , m} that makes λ_{k:m} the k-th leftmost ideal point among the representatives for any realization of λ_1, . . . , λ_m. That is, λ_{k:m} is their k-th order statistic. We disregard the zero-probability events of several constituencies having identical ideal points and define the random variable P by

P ≡ min { j ∈ {1, . . . , m} : Σ_{k=1}^j w_{k:m} > q̃ }.     (5)
Representative P:m will be referred to as the pivotal representative of the assembly.
In the case of simple majority rule, i.e., q = 0.5, the ideal point λ_{P:m} of representative P:m cannot be beaten by any alternative x ∈ X in a pairwise vote, i.e., it is a so-called Condorcet winner and lies in the core of the voting game defined by ideal points λ_1, . . . , λ_m, weights w_1, . . . , w_m and quota q̃.[7] We take

x* = λ_{P:m}     (6)

to be the collective decision taken by the assembly. We do so also in the non-generic cases of the entire interval [λ_{P−1:m}, λ_{P:m}] being majority-undominated in order to avoid inessential case distinctions.[8] The situation under supermajority rules is somewhat less clear-cut. A relative quota q > 0.5 typically induces an entire interval of undominated policies, instead of a single Condorcet winner. Still, the ideal point of representative P:m defined by (5) constitutes an extreme point of this interval. So equation (6) generally identifies the policy outcome for the given quota.
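To make the mechanics of (5) and (6) concrete, the following Python sketch draws one preference profile and locates the pivotal representative; all distributional choices and parameter values here are ours, for illustration (ε_l ~ U[-0.5, 0.5], µ_i ~ N(0, 1)):

```python
import random

rng = random.Random(42)
weights, q, t = [3, 2, 1], 0.5, 1.0
sizes = [7, 5, 3]                       # constituency populations n_i

# One profile draw: nu_l = eps_l + t * mu_i for every voter l in C_i;
# each delegate adopts the constituency's median ideal point.
lam = []
for i, n in enumerate(sizes):
    mu_i = rng.gauss(0.0, 1.0)                        # shared shock
    eps = sorted(rng.uniform(-0.5, 0.5) for _ in range(n))
    lam.append((eps[n // 2] + t * mu_i, i))           # lambda_i

# Pivotal representative P:m: walk delegates from left to right and stop
# once the cumulative weight exceeds q~ = q * (w_1 + ... + w_m).
lam.sort()
qtilde = q * sum(weights)
acc = 0.0
for position, i in lam:
    acc += weights[i]
    if acc > qtilde:
        outcome, pivot = position, i                  # x* = lambda_{P:m}
        break
print(pivot, outcome)
```

The outcome x* always equals the ideal point of the delegate at which the cumulative weight first crosses the quota.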

The Problem of Fair Representation
The event {x* = ν_l} of voter l's ideal point coinciding with the collective decision under these presumptions almost surely entails that small perturbations or idiosyncratic shifts of ν_l translate into identical shifts of x*, i.e., ∂x*/∂ν_l = 1. Voter l can then meaningfully be said to influence, be decisive or critical for, or even to determine the collective decision. This event has probability

p_l ≡ Pr(x* = ν_l),

which depends on the joint distribution of (ν_1, . . . , ν_n) and the voting weights w_1, . . . , w_m that are assigned to the assembly members. Even though p_l will be very small if the set of voters {1, . . . , n} is large, it would constitute a violation of the 'one person, one vote' principle if p_l/p_k differed substantially from unity for any l, k ∈ {1, . . . , n}.

[7] Note that for x* determined in this way, no constituency's median voter has an incentive to choose a representative whose ideal point differs from her own one, that is, to misrepresent her preferences (cf. Nehring and Puppe 2007).
[8] A sufficient condition for the core to be single-valued under simple majority rule is that the vector of weights satisfies Σ_{j∈S} w_j ≠ q · Σ_{j=1}^m w_j for each S ⊆ {1, . . . , m}.
[9] Status quo x° might also vary randomly on X. Then the quantity π_i(t) below captures i's pivot probability conditional on policy change. Justifications for attributing most or all influence in a committee to representative P:m in the supermajority case date back to Black (1948). The focus on the core's extreme points can be motivated, e.g., by distance-dependent costs of policy reform.
Our objective of achieving fair representation can thus be specified as follows.
Given a partition C = {C_1, . . . , C_m} of n voters into constituencies, and distributions G, H and a parameter t ≥ 0 which together describe the heterogeneity of individual preferences within and across constituencies, we seek a mapping from n_1, . . . , n_m to weights w_1, . . . , w_m such that each voter a priori has an equal chance of determining the collective decision x* ∈ X, that is, such that p_l/p_k ≈ 1 for all l, k ∈ {1, . . . , n}.
The model's statistical assumptions imply that p_l = p_k holds for l, k ∈ C_i irrespective of which specific G, H, t, and voting weights w_1, . . . , w_m are considered. Namely, the continuity of G entails that if l ∈ C_i then

Pr(ν_l = λ_i) = 1/n_i.

So an individual voter's probability to be his or her constituency's median and to determine λ_i is inversely proportional to constituency C_i's population size. This will need to be compensated via his or her delegate's voting power in the assembly.
The events {ν_l = λ_i} and {x* = λ_i} are independent. (The first one only entails information about the identity of C_i's median, not its location.) It follows that the probability p_l of an individual voter l ∈ C_i influencing the collective decision x* is 1/n_i times the probability of event {x* = λ_i} or, equivalently, of {P:m = i}. Letting

π_i(t) ≡ Pr(P:m = i)

denote the probability of constituency C_i's representative being pivotal in the assembly for a given parameter t, a solution to the problem of fair representation hence consists in mapping constituency sizes n_1, . . . , n_m to voting weights w_1, . . . , w_m such that

π_i(t)/π_j(t) ≈ n_i/n_j for all i, j ∈ {1, . . . , m}.     (11)
Note that if the representatives' ideal points λ_1, . . . , λ_m were not only mutually independent but also had identical distributions F_i = F_j for all i, j ∈ {1, . . . , m}, then all orderings of λ_1, . . . , λ_m would be equally likely. In this situation, player i's probability of being pivotal, π_i(t), would simply be i's Shapley value φ_i(v) (see Shapley 1953), where v is the characteristic function of the m-player TU game in which the worth v(S) of a coalition S ⊆ {1, . . . , m} is 1 if Σ_{j∈S} w_j > q̃ and 0 otherwise, and

φ_i(v) ≡ Σ_{S ⊆ {1,...,m}\{i}} [ |S|! · (m − |S| − 1)! / m! ] · [ v(S ∪ {i}) − v(S) ].     (12)

Yet, under the normatively attractive 'veil of ignorance' assumption that individual voters' ideal points are identically distributed, the ideal points λ_1, . . . , λ_m of their representatives will only have an identical distribution in the trivial case n_1 = . . . = n_m.
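The random-order characterization behind (12) translates directly into code. The following Python sketch (names ours) computes the SSI of a weighted majority rule by enumerating all m! orderings, which is feasible for small assemblies:

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def ssi(weights, q):
    """Shapley-Shubik index of [q; w_1, ..., w_m]: in each of the m!
    orderings, credit the player whose entry first lifts the cumulative
    weight strictly above q~ = q * sum(weights)."""
    m = len(weights)
    qtilde = Fraction(q) * sum(weights)
    pivots = [0] * m
    for order in permutations(range(m)):
        acc = 0
        for i in order:
            acc += weights[i]
            if acc > qtilde:
                pivots[i] += 1
                break
    return [Fraction(p, factorial(m)) for p in pivots]

# The player with weight 3 is pivotal in 4 of the 3! = 6 orderings of
# [0.5; 3, 2, 1], so its SSI is 2/3; the other two players get 1/6 each.
print(ssi([3, 2, 1], Fraction(1, 2)))
```

Exact rational arithmetic via `Fraction` avoids rounding issues when quota comparisons are tight.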
Otherwise, a smaller number n_i < n_j of draws from the same distribution generates a sample whose median λ_i has greater variance than the respective sample median λ_j. Technically, π(t) corresponds to a random order value where the 'arrival time' distributions are mean-preserving spreads of each other (see, e.g., Monderer and Samet 2002, Sec. 4). Kurz et al. (2016), for a simple majority quota of q = 0.5 in the assembly, study how the sample size effect on the realized medians gives a pivotality advantage to the delegates from large constituencies. For instance, in the i.i.d. case with t = 0, n_j = 4 · n_i implies that the delegate from constituency C_j is twice as likely to be the unweighted median among the delegates, i.e., π_j(0) = 2 · π_i(0), if n_i is sufficiently big. A fair allocation then needs to give delegate j only about twice the weight of delegate i in order to satisfy (11). More generally, the observation that the density of the sample median λ_i at the expected location of λ_{P:m} is proportional to the square root of sample size n_i gives rise to a square root rule as m → ∞ in case t = 0. See Kurz et al. (2016, Sec. 4) for details.
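This sample-size effect is easy to replicate by simulation. The Python sketch below (sizes, trial counts, and names are our illustrative choices) draws each delegate's position directly as a sample median of uniform ideal points, using the fact that the median of 2k + 1 i.i.d. U(0, 1) draws has a Beta(k + 1, k + 1) distribution, and estimates how much more often the delegates of constituencies roughly four times as populous are the unweighted median of the assembly:

```python
import random

def delegate_median(n, rng):
    """Sample median of n i.i.d. U(-0.5, 0.5) ideal points, drawn in one
    step: for n = 2k + 1 the median of n uniforms is Beta(k+1, k+1)."""
    k = (n - 1) // 2
    return rng.betavariate(k + 1, k + 1) - 0.5

rng = random.Random(7)
sizes = [25] * 6 + [101] * 5       # n_j roughly 4 * n_i
trials = 20000
hits = [0] * len(sizes)
for _ in range(trials):
    lam = sorted((delegate_median(n, rng), i) for i, n in enumerate(sizes))
    hits[lam[len(sizes) // 2][1]] += 1      # unweighted median delegate
p_small = sum(hits[:6]) / (6 * trials)
p_large = sum(hits[6:]) / (5 * trials)
print(p_large / p_small)           # estimated pivot-probability ratio
```

The estimated ratio is in the vicinity of √(101/25) ≈ 2, in line with the square-root density effect described above.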

Fair Representation with Affiliated Constituencies
We here study the case t > 0 and keep the number m of constituencies fixed. We thus capture the realistic scenario in which a big electorate is partitioned into moderately many constituencies. These differ not just in size but exhibit some internal similarity.
The key observation for this case of internally affiliated constituencies is that the indicated sample size effect on the distribution of the median voter only pertains to the idiosyncratic components of delegates' preferences, i.e., to

ε̃_i = median {ε_l : l ∈ C_i}.
In particular, ε̃_i's variance is approximately ½π · σ_G²/n_i (see, e.g., Arnold et al. 1992, Thm. 8.5.1). This contrasts with the constant variance t²σ_H² of the constituency-specific preference component.
The density function of delegate i's ideal point λ_i is the convolution of the density of a random variable that does not vary with n_i and that of a random variable which vanishes as n_i grows. On the realistic presumption that σ_G² is not bigger than σ_H² by several orders of magnitude, the distribution of the constituency-specific shocks hence comes to dominate the distribution of individual-specific shocks as we consider population sizes in the thousands or millions.
Proposition 1. For any voting rule [q; w_1, . . . , w_m] and each i ∈ {1, . . . , m},

lim_{t→∞} π_i(t) = φ_i(v).     (13)

Proof. The proposition follows from the Shapley value's definition (12) and the observation that the orderings which are induced by realizations of the vectors λ^t = (λ^t_1, . . . , λ^t_m) and µ = (µ_1, . . . , µ_m) coincide with a probability that tends to 1 as t grows. To see the latter, ignore any null events in which several shocks or ideal points coincide and let σ̂(x) denote the permutation of {1, . . . , m} such that x_i < x_j whenever σ̂(i) < σ̂(j) for a real vector x = (x_1, . . . , x_m). The following lemma then establishes that

lim_{t→∞} Pr(σ̂(λ^t) = ρ) = 1/m!     (14)

for each permutation ρ of {1, . . . , m}.
To prove the lemma, denote the finite variance of ε̃_i by σ_i² and let U ≡ (max_i |E[ε̃_i]|)³. We can choose a real number k such that the bounded density function h of µ_i, with i ∈ {1, . . . , m}, satisfies h(x) ≤ k for all x ∈ R. For any given realization µ_j = x, the probability of the independent random variable µ_i assuming a value inside the interval (x − 4t^{-2/3}, x + 4t^{-2/3}) is bounded above by k · 8t^{-2/3}. We can infer that the event |µ_i − µ_j| < 4t^{-2/3}, which is equivalent to the event |tµ_i − tµ_j| < 4t^{1/3}, has a probability of at most k · 8t^{-2/3} for any i ≠ j ∈ {1, . . . , m}. And we can conclude from Chebyshev's inequality that Pr(|ε̃_i − E[ε̃_i]| ≥ t^{1/3}) ≤ σ_i² · t^{-2/3}; since t ≥ U implies |E[ε̃_i]| ≤ t^{1/3}, the event

|ε̃_i| < 2t^{1/3}     (15)

then follows from |ε̃_i − E[ε̃_i]| < t^{1/3} by the triangle inequality. Hence, the probability for (15) to hold when t ≥ U is at least 1 − σ_i² · t^{-2/3} for each i ∈ {1, . . . , m}. Now consider the joint event that (i) |tµ_i − tµ_j| ≥ 4t^{1/3} for all pairs i ≠ j ∈ {1, . . . , m} and (ii) |ε̃_i| < 2t^{1/3} for all i ∈ {1, . . . , m}. In this event, the ordering of λ^t_1, . . . , λ^t_m is determined entirely by the realization of tµ_1, . . . , tµ_m; in particular, σ̂(λ^t) = σ̂(µ). Using the mutual independence of the considered random variables, this joint event must have a probability of at least

[1 − (m(m−1)/2) · 8k · t^{-2/3}] · Π_{i=1}^m [1 − σ_i² · t^{-2/3}]

for t ≥ U. The right hand side tends to 1 as t approaches infinity. It hence remains to acknowledge that any ordering σ̂(µ) has an equal probability of 1/m! because µ_1, . . . , µ_m are i.i.d.
We remark that Proposition 1 does not require identity (2) to hold; the limit (13) applies also if λ_i is determined, e.g., by an oligarchy instead of the median voter of C_i. Moreover, it is worth noting that Proposition 1 imposes very mild conditions on the densities g_1, . . . , g_m, voting weights w_1, . . . , w_m, or quota q̃: the Shapley value φ(v) automatically takes care of the combinatorial particularities associated with [q; w_1, . . . , w_m]; and the convolution with t · µ_i's bounded density, (1/t) · h(x/t), is sufficient to 'regularize' any even non-continuous distribution G_i of ε̃_i.
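The limit (13) also lends itself to a quick Monte Carlo check (a Python sketch; the distributions, the game [0.5; 3, 2, 1], and all parameter values are our illustrative choices). For a large polarization parameter t, the estimated pivot probabilities should be close to the game's SSI, which equals (2/3, 1/6, 1/6) here:

```python
import random

def pivot_estimates(weights, q, t, sizes, trials, seed):
    """Monte Carlo estimates of pi_i(t): delegate i's position is the
    median of n_i idiosyncratic U(-0.5, 0.5) shocks plus t * mu_i with
    mu_i ~ N(0, 1); the pivot is the weighted median under quota q."""
    rng = random.Random(seed)
    m, qtilde = len(weights), q * sum(weights)
    hits = [0] * m
    for _ in range(trials):
        lam = []
        for i, n in enumerate(sizes):
            eps = sorted(rng.uniform(-0.5, 0.5) for _ in range(n))
            lam.append((eps[n // 2] + t * rng.gauss(0.0, 1.0), i))
        lam.sort()
        acc = 0.0
        for _, i in lam:
            acc += weights[i]
            if acc > qtilde:
                hits[i] += 1
                break
    return [h / trials for h in hits]

# With t large, the constituency shocks dominate and every ordering of
# the delegates becomes (almost) equally likely.
est = pivot_estimates([3, 2, 1], 0.5, t=50.0, sizes=[51, 51, 51],
                      trials=3000, seed=3)
print(est)   # close to the SSI (2/3, 1/6, 1/6)
```

Sampling noise aside, the estimates match the Shapley value rather than the Banzhaf-based benchmark.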
Weights and power sometimes cannot be aligned with population figures even for large numbers of constituencies. In response to Riker and Shapley (1968), Robert Nozick (1968, p. 221) pointed to the following example with q = 0.5: let an assembly consist of an odd number of legislators representing groups of equal size, and one legislator who represents a smaller group. Then each of the odd number of legislators must be given the same number of votes. If the single legislator is given that weight, too, he or she has power in excess of the size of the group; if given less weight, he or she has no power at all.
Unfortunately, no useful bounds on the unavoidable gap to a given SSI target vector are known. Simple hill-climbing algorithms often deliver excellent results, and good heuristic solutions exist if the relative quota q is a variable rather than given (see Kurz and Napel 2014). Still, one cannot rule out that these identify only a local minimum of the distance between the desired and the induced power vector. For m < 9, complete enumeration of voting games is the best option (Kurz 2012).

The linear Shapley rule clearly outperforms simple population weights at any level of preference polarization.[13] The gap between representation according to the linear Shapley rule and perfectly fair representation narrows quickly as t increases; it is already close to zero for t ≈ 10. One can also see that the lead of the Shapley-based weights over simple population weights is more pronounced for the higher vote threshold in panel (b), in line with our earlier comments on the inverse problem.

[11] The treaty defined voting weights and a quota, and stipulated two other but essentially negligible criteria. The Nice rules can still be invoked in the EU until March 2017, when they will be replaced for good by the new voting system agreed in the Treaty of Lisbon.
[12] We considered ε_l ∼ U[−0.5, 0.5] and µ_i ∼ N(0, σ_H²) with σ_H² = 10⁻⁸. Estimates of the induced pivot probabilities π_i(t), and hence of deviations from the democratic ideal, were obtained by Monte Carlo simulation. We used the Nelder-Mead method in order to solve the underlying inverse problems.
[13] In view of the limit results by Neyman (1982) and Lindner and Machover (2004), it is noteworthy that there is still a noticeable advantage even for the relatively big number of 28 constituencies. The advantage can be expected to be higher for examples with smaller m.
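A minimal hill-climbing sketch for the inverse SSI problem is given below (Python; the target sizes, starting point, and all names are our illustrative choices; real applications use more refined searches, cf. Kurz and Napel 2014):

```python
from itertools import permutations
from math import factorial

def ssi(weights, qtilde):
    """Shapley-Shubik index: in each ordering, count the player whose
    entry first lifts the cumulative weight strictly above qtilde."""
    counts = [0] * len(weights)
    for order in permutations(range(len(weights))):
        acc = 0
        for i in order:
            acc += weights[i]
            if acc > qtilde:
                counts[i] += 1
                break
    total = factorial(len(weights))
    return [c / total for c in counts]

def hill_climb(target, q=0.5, steps=200):
    """Greedy local search on positive integer weights: accept any +/-1
    change of a single weight that reduces the squared distance between
    the induced SSI and the target power vector."""
    w = [1] * len(target)
    loss = lambda w: sum((a - b) ** 2
                         for a, b in zip(ssi(w, q * sum(w)), target))
    best = loss(w)
    for _ in range(steps):
        improved = False
        for i in range(len(w)):
            for d in (1, -1):
                if w[i] + d < 1:
                    continue
                trial = w[:i] + [w[i] + d] + w[i + 1:]
                if loss(trial) < best:
                    w, best, improved = trial, loss(trial), True
        if not improved:
            break
    return w, best

sizes = [9, 5, 3, 1]
target = [n / sum(sizes) for n in sizes]    # linear Shapley rule target
weights, err = hill_climb(target)
print(weights, err)
```

As the text cautions, such a search may stop at a local minimum; for small m one would verify the result by complete enumeration.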

Concluding Remarks
When Lloyd S. Shapley and his collaborators contemplated the problem of fair representation, they already mentioned proportionality to the Shapley value as a possible benchmark. Riker and Shapley (1968) did not give it much emphasis, however, compared to a square root recommendation in the tradition of Penrose (1946, 1952) and Banzhaf (1965). The key reason, to us, seems to be their focus on perfectly exchangeable voters.
This may actually be the most appropriate assumption in applications to, say, a federal state with high geographic mobility, like the US. However, when constituencies correspond to entire nations, as in the case of the Council of the EU or the ECB Governing Council, voters tend to share more historical experience, traditions, language, communication, etc. within constituencies than across them (see Alesina and Spolaore 2003). Many lower-key institutions with a delegate structure, such as university senates, councils of NGOs, boards of sport clubs, etc., involve constituencies (faculties, divisions, and so on) whose composition reflects a sorting of like-minded individuals. Some preference similarity within and dissimilarity across constituencies thus often seems more plausible.
Our continuous rather than binary voting model then implies that equal expected influence on outcomes requires proportionality between a constituency's size and the respective probability -approximated by the Shapley value -of getting its way, i.e., of seeing its median voter's preferences implemented. We provided motivation for such proportionality by considering an individual's probability to be pivotal in his or her constituency, noting that it is the inverse of the respective population size. This is not the only possible motivation for proportional pivotality at the top tier.
As Kurz et al. (2016) explain in more detail, one can also operationalize the influence of a given individual by considering the expected effect of participation. Namely, every local voter almost always has influence on the position of the respective constituency median, and hence delegate, because abstention and the consequent deletion from the considered sample would shift the realized median position. For instance, if a voter with an ideal point to the left of λ_i is removed from the preference sample in C_i, its median shifts to the right; a faithful delegate will then pursue a position λ'_i > λ_i in the assembly. The expected size |λ'_i − λ_i| of such a shift, and also a voter's incentive to turn out, falls in C_i's population size and, specifically, can be shown to be asymptotically proportional to 1/n_i. Therefore the same condition (11) follows.
Links from a constituency's pivotality to electoral campaign efforts and pork barrel funds also allow one to arrive at Corollary 1 on equal treatment grounds.
In contrast to the square root result derived by Kurz et al. (2016) for t = 0, Corollary 1 applies to arbitrary vote thresholds in the assembly. This admittedly involves a weaker notion of the representative identified by equation (5) being 'decisive' when q > 0.5 than when q = 0.5. Still, it gives the linear Shapley rule additional robustness, which is appealing in view of the widespread use of supermajority rules in real decision-making bodies.
At a normative level, the discrepancy between the findings for i.i.d. voters and positively affiliated voters raises a non-trivial question of practical philosophy: Which kinds of inter-constituency heterogeneity shall be acknowledged behind the 'veil of ignorance'? Constitutional design with a long-term perspective should arguably assume preferences to be distributed identically in all constituencies, even though historical patterns may suggest greater conservatism, religiosity, etc. for some constituencies rather than others. There may analogously exist normative reasons outside the scope of our analysis for setting t = 0 even though t > 0 is more plausible.
Then, Riker and Shapley's (1968) main hunch about proportionality of the Shapley value to the square root of population sizes was right (at least for q = 0.5). Otherwise, the linear rule which they discussed almost in passing provides the more "consistent criterion for 'fair representation'".