Topic: Team-level preference adjustment for diversity
September 18, 2014, 11:41:57 AM

A general outline of an option that has not been widely discussed, to my knowledge. I would love some feedback. It’s not a panacea, and there are plenty of details to work out.

Diversity in judging is a means, not an end, to diversity in general.  The means itself is complex and multi-directional.  One goal is to have more people of underrepresented groups judge more rounds.  One is to have those judges have more impact on rounds.  One is to have those judges judge more high-level rounds.  One is to have more debaters experience judging from groups that are underrepresented.  One goal is to keep these judges in the community and make them feel welcome (for some that might entail judging more, some less).  One is to have more debaters from excluded groups have judges that make them feel comfortable/welcome. It is important to keep in mind that in a complex system like a debate tournament or debate community, maximizing some of those goals will have unintended or unpredicted results.  Putting the burden of increasing the number of rounds judged by minority judges without concurrently increasing the number of minority judges necessarily increases the labor burden on those judges.  There might be correctives for the specific problem, but you can’t have more underrepresented judges in debates without making the limited pool judge more debates.   If we act to increase the number of teams that are judged by the limited pool of judges from underrepresented groups, then you necessarily decrease the number of times a team that normally prefers that judge or group will get to have that judge. 

Different people will support different aspects at different times and for reasons that are multi-faceted and divergent.  It is not enough to say that there are other diversity problems.  This judging diversity problem is disambiguated and under-theorized.  The only seeming measure that we have is elim panels, and the only independent variable is the diversity in judging policy.  These are important, but in what way depends on what you are trying to achieve.

Short version of the proposal (More at the end)
Current ordinal and other preference systems fail because they equate preference on a neutral schema that does not recognize its history of power-relations and ideology. Most correctives offered to date do not fix this problem.
One solution would be to affirmatively weight each team’s judge preferences based on its members’ identification with, or inclusion in, an under-represented group.
This solution works because it centers on the debaters and the effects that historical and structural exclusion have had on the community. It could have a lasting impact on the makeup of the community, and it does not require a drastic change in practices.
Specifically, tab software/decisions should give weighted preferences to each team that includes a member of an under-represented group.
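
As a rough illustration of what a weighted preference could look like inside tab software, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function names, the percentile inputs (0 is a team's most preferred judge, 100 its least), and the weight of 2.0 are hypothetical and are not taken from any existing tab program.

```python
def weighted_fit(pref_a, pref_b, weight_a=1.0, weight_b=1.0):
    """Score a judge for one debate; lower is a better fit.

    pref_a, pref_b: each team's ordinal percentile for the judge
    (0 = most preferred, 100 = least preferred).
    weight_a, weight_b: hypothetical multipliers; a team that includes a
    member of an under-represented group would get a weight above 1.0.
    """
    return (weight_a * pref_a + weight_b * pref_b) / (weight_a + weight_b)


def best_judge(judges, weight_a=1.0, weight_b=1.0):
    """Pick the judge with the lowest weighted score.

    judges: dict mapping judge name -> (pref_a, pref_b).
    """
    return min(judges, key=lambda j: weighted_fit(*judges[j], weight_a, weight_b))


# Hypothetical pool: judge X suits team A, judge Y suits team B.
pool = {"X": (10, 40), "Y": (45, 15)}

unweighted = best_judge(pool)              # both teams count equally
weighted = best_judge(pool, weight_b=2.0)  # team B's preferences count double
print(unweighted, weighted)
```

With equal weights this sketch places judge X; doubling team B's weight moves the placement to judge Y. The right size for the weight is one of the details left to work out.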

Current Market Failure:
Judge preferences are a market in a bounded system. Without recognizing and mapping the complexity of the system, solutions could be hugely mitigated or counter-productive.

To paraphrase Bill Black (former NDT Debater, former government regulator, current UMKC professor), "Any market-system that primarily uses individual ethics as the means to create fairness is designed to fail." We are trying to use the ethics of each team to achieve a community goal that is perceived to be in conflict. We are trying to use a purely market solution to a problem that is foundational to the market, i.e., choice. The problem is that the debaters and coaches that make decisions to rank judges where they are presently ranking them perceive that they are doing that out of a neutral goal of maximizing competitive success. They also likely perceive that they have enough information to make this neutral determination in a neutral fashion. This individual-level goal is incommensurable with the market-level goal of justice/fairness, and the solution has to recognize the dynamic. The resultant failure could be classified as a tragedy of the commons because no individual choice can save or destroy the commons. In systems theory or game theory, this lack of justice is an emergent behavior that results from a Gresham’s Dynamic, where people that game the system are likely to succeed and drive out the honest-brokers. Even the perception of gaming the system is devastating and would likely cause even the honest-brokers to follow suit. Individuals will not likely perceive benefits on the individual-level variable of competitive success by acting ethically, and they may perceive benefits on that variable by acting unethically. Therefore, the individual-level variable virtually ensures that the commons-level variables of fairness and justice are ignored to their detriment.

Preferences are necessary
Judge placement is an economic market problem even without preference systems. Random placement of judges just means that people in the market do not have an effective way to enact their preferences for the judge. The result of this is certainly not fairness. It presumes neutrality of the judge pool, which certainly does not seem likely. The result of random judge placement is strongly divergent, probably prefers the people that control the dialogue now, and would likely be determined more by tournament attendance preference than at the judge-pool level. I think that it’s a non-starter. If you analogize it to the economy as a whole, if we took everyone’s money, the people that currently have control of the most capital resources, human capital (education, training, experience, etc.), and interpersonal/intergroup connections  would likely succeed. That is not fairness. Our experience with this recently is the Wake Forest announcement that suggested that we use little or no preference, but instead use mutuality. The response was not positive, and Wake Forest changed the policy before it was used.

Preferences are not neutral or fair on their own.
The problem with debate’s preference system is that there is not a functional understanding of power relations and their effect on the market. If we have preferences, then it is always ok to make some determinations based on the likelihood that someone is going to vote for you given other equalities. Marginalism is the basic underlying concept of liberal and neoliberal economics, and it will help explain one of the largest problems of preferences as currently conceived.  The idea is that we can measure and know the additional utility of spending an extra dollar, hiring another hour of labor, producing another widget, etc.  Then, we make economic calculations based on that knowledge.  We only buy things if the added marginal utility is worth the loss of the utility of the money we spend to purchase the good or service.  It’s pretty much what we all learn about economics:  That there is a supply curve and a demand curve, and at some point it meets in the perfect point that is the optimal price for the good/service.  We are taught that firms and customers can know that point, and that if we aggregate it enough across an industry through competition, we will pretty much get there.

There are lots of problems with marginalism, like that it is tautological instead of predictive, but one of its biggest problems is that it ignores power relationships. Power based on control of knowledge, resources, etc. means that individuals or firms with more power are able to manipulate pricing. Political power affects what is considered a marginal cost of production. Marketing distorts demand. Marginalism also assumes that supply and demand are independent, which can’t be true in the aggregate because each change in supply affects resources, wages, employment, etc., all of which change demand in unpredictable and non-constant ways.

In judge preferences, we tend to work with numbers as marginalists work with economics. Since we would be considered a bounded industry, it is easy to see how comparing preferences across individuals and teams using percentages is inadequate. We often measure placement percentages in the aggregate, for example. Placement algorithms attempt to give each team judges that fall within a certain range of preference and then try to measure that preference against the preference of the other team. That would presume that the 25th percentile judge has the same marginal utility for each team. This is certainly not the case. First, the demand curves are radically divergent. For any given team, the utility of a judge in the 70th percentile is likely profoundly different. The marginal utility for each judge higher in the curve is also quite different, meaning that the difference between judge 70 and 80 might be huge for one team and not very large for another team. It’s very difficult to imagine the upper and lower bound percentiles for chances of victory for any given judge for any team. But, we assume that we can do that…or we probably wouldn’t care about preferences very much. For some teams, that range is probably anywhere from 95% to 30%. That is, a team may assume that its top judge will vote for it roughly 90% of the time if it “objectively” wins the round. If it is the team’s bottom judge, that judge is at least as likely to get it wrong as to get it right, and might even have some perceived bias that makes them very likely to get it wrong. However, for other teams, the range is probably different, more like 90% to 2%, meaning that, if that team presumes the ideal from their context, they will presume that some judges will virtually never vote for them, even if they are “winning”. This would be the judge that rejects an argument’s validity out of hand, in stark contrast to the team that presumes its legitimacy.
While you might quibble with the specifics, there is virtually no doubt that this is true. The marginal utility change is vastly different for each team’s preferences. We use broad lines like 50% or 70% because we realize the marginal difference, but we assume that it can’t be so radical as to obviate the reasonableness of those numbers. This is probably wrong. In reality, there is probably a divergence in the curves that is so radical as to make the similarities meaningless, and it probably happens well before the 50th percentile. At 70%, I can imagine a very large divergent marginal utility.
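
To make the divergence concrete, here is a small Python sketch using the two ranges mentioned above (95% to 30% and 90% to 2%). The straight-line interpolation between a team's top and bottom judge is purely an assumption for illustration; the real curves are unknown and, as argued above, probably diverge much earlier and more sharply than a straight line would suggest.

```python
def perceived_win_chance(percentile, top, bottom):
    """Perceived chance that a judge votes for a team that is "objectively" winning.

    percentile: where the judge sits on the team's pref sheet (0 = top, 100 = bottom).
    top, bottom: perceived chances for the team's best and worst judge.
    Linear interpolation is an illustrative assumption, not a measured curve.
    """
    return top + (bottom - top) * percentile / 100


# Ranges taken from the discussion above.
team_one = dict(top=0.95, bottom=0.30)
team_two = dict(top=0.90, bottom=0.02)

for pct in (25, 50, 70):
    one = perceived_win_chance(pct, **team_one)
    two = perceived_win_chance(pct, **team_two)
    print(f"percentile {pct}: team one {one:.2f}, team two {two:.2f}, gap {one - two:.2f}")
```

Even under this generous linear assumption, the same percentile means something different to each team, and the gap grows as the judge moves down the sheet.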

Debate marginalists will suggest that this is natural and shows that the value of the team’s arguments is lower than the team thinks.  This naturalizes the disparity by saying that the “objective” winner is accurately determined by preferences.  It tends to confirm ideological bias as unnatural or the fault of the team or judge that differs from you.  This ignores that ideological bias in the economy in general, and in debate, is informed by power relations.  The ideology of the judge pool is necessarily conservative, because it is grounded in decades or millennia of development.  Any challenge to that is of course seen as not legitimate, ideologically.  This is where marginalism displays its tautological tendencies.  Because it presumes that in the aggregate, judges will make the “correct” decision if we just offer enough “equal and fair” judges.  There is no way to externally validate any given round’s accuracy, and we accept split decisions amongst judges otherwise assumed competent (or reasonably incompetent).  Very different than “ideological bias”, this is pretty much the imagined absence of ideology.

When applied to the specific problem of race and sex judge preference, the problem is even more at odds with marginalism. Assume a pool of judges that has 100 judges with 5 black people and 10 women. When evaluating the marginal difference of judges on the issue of race (narrowly construed here as black), the difference between the first 5 is zero, and the difference between 5 and 6 is 100%. The same is true of women between 10 and 11. By the way, other categories work this way, but are obviously more of a spectrum, like class, education, training, experience. Back to our two preference categories. If we understand that people’s context can dramatically affect their perception of reality, and that two of the largest inputs into experience and context are race and sex, the problem is profound. Research strongly suggests that people trust speakers more, and rate their credibility higher, if the speakers are in categories similar to their own. People speak in ways that have more impact on people that share experience. If that is the case, then you can see that a white person has a curve on the marginal measure of race similarity that is virtually flat at 100% until the 95th judge. Contrast that with the black debater above, whose curve drops to zero after the 5th judge. On this measure, you might say that the 5th judge on the black debater’s pref sheet is roughly equal, in relation to the race-credibility measure, to the 95th judge on the white person’s pref sheet. Of course, there are lots of complicating intersections, but there is no way that they could ever smooth that curve disparity out. If intercultural communication is commensurable, then it would seem unlikely that it would happen at the margins.

A woman has 10 chances in our pool to have a judge of the same sex, but a man would have 90. Once you include intersections, the disparities become even more cartoonish. Regardless of argument preference, it is very unlikely that a team with a black person or woman would estimate that the marginal utility of a judge ranked at 25 is the same as the marginal utility of the same-ranked judge for an all-white-male team. What we have done is equate non-equal choices, which is definitionally unfair.
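
The arithmetic of the example pool can be written out in a few lines of Python. The pool numbers (100 judges, 5 black, 10 women) come from the example above; the function names and the best-case assumption that a debater ranks every same-group judge first are mine, for illustration only.

```python
POOL_SIZE = 100
BLACK_JUDGES = 5
WOMEN_JUDGES = 10


def same_group_share(group_size, pool_size=POOL_SIZE):
    """Fraction of the pool that shares a given trait with the debater."""
    return group_size / pool_size


def shares_trait_at_rank(rank, group_size):
    """Best case: the debater ranks every same-group judge first, so the
    judge at `rank` (1-based) shares the trait only while rank <= group_size."""
    return rank <= group_size


# Share of the pool that matches each debater on the single trait.
print("black debater:", same_group_share(BLACK_JUDGES))
print("non-black debater:", same_group_share(POOL_SIZE - BLACK_JUDGES))
print("woman:", same_group_share(WOMEN_JUDGES))
print("man:", same_group_share(POOL_SIZE - WOMEN_JUDGES))

# The step in the curve: ranks 5 and 6 for the black debater,
# ranks 95 and 96 for the non-black debater.
print(shares_trait_at_rank(5, BLACK_JUDGES), shares_trait_at_rank(6, BLACK_JUDGES))
print(shares_trait_at_rank(95, POOL_SIZE - BLACK_JUDGES),
      shares_trait_at_rank(96, POOL_SIZE - BLACK_JUDGES))
```

On this single measure the curve is a step, not a slope, and the step sits at rank 5 for one debater and rank 95 for the other.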

Is it ever ok to use ideological bias to determine prefs?
The answer is yes, but the legitimacy of ideological bias can’t be externally validated by the ability to win given the underlying bias of the judging pool. Competitions do not become fair by making all arguments equal. Some arguments are better than others. The problem is that we have presumed the neutrality of the judges and we equate problems of intercultural difference with ideology. This presumes neutrality, and while it recognizes “ideology,” it presumes an ideological margin that is roughly equal. Teams that rely on conservative views of debate often are over-relying on the “naturalness” of the arguments because of a history of those arguments. Some ideological biases are more legitimate than others. We would probably all agree that discrimination against gays has always been wrong, yet the society at large has been very slow to adopt that perspective. We deem it more legitimate to condemn a bigot than we do to condone a bigot’s condemnation of a gay person. We would not ever consider majoritarianism a proper response to that. I don’t want to equate plan debate with bigotry; it is not even close to similar. The point is that some claims of ideological bias are more legitimate. The bigot is assuredly confident in his ideological correctness too.

Most agree that debate teams fall on a spectrum of ideological positions vis-à-vis debate practices. We presently allow an assumed-neutral judge preference system to validate the ideologically correct vision of debate. That judge preference system is fundamentally ideologically skewed, by roughly any measure. Therefore, the outcome can’t be neutral or valid. It is ok to prefer judges on the spectrum of ideological bias, but some teams are more able to express that preference now. A team’s ability to prefer a judge in the top 50% that thinks a plan is legitimate is 100%. A team that does not have a plan probably does not have anywhere near that luxury for affs that don’t have a plan. That is just one measure. Does that make the majority right?

We all think that there are better judges than other judges. To say that judges differ on experience and ideology is not to say that all differences between them are evacuated. The problem is that the marginal difference is high between TEAMS, not judges, and our preference system does not value that. If you think anything that I have said above about marginalism makes sense, then you should understand that aggregating further is not likely to improve the preferences for the already marginalized team. Since the large marginal differences start very close to the top of the pool, aggregating out is more likely to just push the overall judging score further towards the other team.

I don’t reject MPJ, because it is too important as an ideological corrective for teams at the margins.  It should not be seen as an actual equalizer, though, because it is an inappropriate remedy. 

Our current solutions are bad

Reliance on the ethics and good behavior of the community is virtually guaranteed to fail.
I will start from the premise that most everyone that fills out their preferences does so attempting to maximize their chances of winning each round.  There are some people that do not act this way.  Some people purport to try to affirmatively place judges that are in underrepresented groups higher.  I have no reason to doubt the veracity of these claims.  There are, no doubt, limits to this option.  Very few teams would be willing to just put all the under-represented judges at the top (or wherever you feel like the position that maximizes your likelihood of having that judge).  Even teams that are attempting to fix things individually still rank or group some “diverse” judges above others, and very likely behind some non-diverse judges.  That means that virtually any system of preferences is corrupted by competitive ends.  That is not to say that preferences should not happen, but the goals are in tension. Also, one is a community goal, and one is an individual goal. The commons are rarely saved by the autonomous actions of each member of the community.

Teams that do this fit into two broad categories (perhaps a spectrum). 1. Teams that recognize that their preferences are racist/sexist and do not represent an objective competitive preference order to maximize competitive success.  These teams are playing the competitive game, but are aware that the information feeding their competitive algorithm is likely corrupted.  2. Teams that are willing to sacrifice competitive success for educational and or diversity goals.  These teams are admitting that there are real competitive negative outcomes related to their actions. Once you admit this, it is pretty difficult to sustain. Both have some laudable aspects, but both rely on the underlying premise that you can know which judge you want to maximize competitive success.  I think that it would be very difficult to objectively evaluate the veracity of the attempts for teams to do this as well.  Even internal to the team or squad or individual, the number systems are too abstract and the complexity of pairings make the concept of rational choice virtually incoherent.  Therefore, any corrective is based on a false underlying premise that you knew where you were going to place the judge before your individual corrective.  It also brings a neutrality to play that suggests that if you were to realize in the future that a judge should have been lower on the list, then it is ok to move them back down (while still correcting up). There are teams that do not affirmatively move people up as well.  This is the tragedy of the commons behind a veil of secrecy that probably ensures that nothing will change. This might also be an instance of Gresham’s Dynamic where playing “unfairly” is rewarded and virtually ensures fraud.

This is pretty normal behavior in a complex competitive system.  Individuals perceive their preferences as rational, individually-self-maximizing behavior. I have heard many people note that their preferences use all reasonably available criteria to evaluate the likelihood that they get a judge that maximizes their chances of winning in the specific and aggregate.  To each team, their preferences are perceived as objective.  There are of course games that people play already.  Put a judge that is in for 1 round very high because it is unlikely that you will get that judge.  Put the judge that you want the most at the 13th percentile to maximize the mutuality or whatever (I made that up – don’t do that).  Each team’s preference list is not an in-order ranking of whom the team feels are the best judges.  It is a list to maximize competitive success, and it would be very difficult to change that by tinkering with numbers at the margins.  These teams (likely) do not view their actions as de jure racism or sexism.  They might be aware of the aggregate impact, but the tragedy of the commons is a powerful burden.

Opt-in Affirmative Action for judges is not very effective.
The main problem with this solution is it relies primarily on the good faith of everyone involved to either admit that they are (acting) racist/sexist or be willing to take a perceived hit to their competitive goals.  It is good faith that not only cannot be externally verified, but probably can’t be internally verified.  As a preference example, it is impossible to prove that you moved someone up by 5% or that you didn’t move someone down by 5%.  Each judging pool is different, and each competition pool is different.  There is no control group.  This presents problems when this solution meets the underlying premise.  Group A tries to maximize competition with little or no lens for diversity.  Group B tries to increase diversity individually.  In a system of affirmative action, group A should be predicted to change their preferences from their normal pattern.  They may do this intentionally or unintentionally (it matters for intent, but hardly for outcome).  If a judge suddenly has an additional preference weight, then that is new information for the algorithm of objectivity that Group A teams use to determine prefs.  If a Group A team does not see their preferences as de jure racism, then why would this new information not change their calculation?  The end result for this team is that the likelihood of getting the affirmatively placed judge is largely unchanged.  Group B has a different dynamic.  They have already supposedly acted to try to fix part of this problem on their own.  Their expected behavior would be to attempt to return their preferences to their original, non-affirmative placement in order to let the institutional solution fix the problem.  The overall result is that underrepresented judges are no more likely to be placed in any given debate because of the institutional preference system.  Preferences are individual.  
Surely the vast majority of people filling out preferences think that their current preferences are the appropriate, good-faith, preferences.  The other option is that there are people knowingly acting in bad faith.  Unfortunately, this system does not seem to do anything to fix that problem.  To the extent that current preferences actually do maximize competitive success and, therefore any deviation likely reduces that, any team that acts in bad faith is rewarded for their fraudulent preferences.

The problem becomes even worse with the addition of an opt-in/out procedure.  I want to preface this by saying that I understand why we would want this option.  There is a real effect on the question of increasing labor on the underrepresented group as a corrective to a problem that is not of their making.  If we do affirmative placement, it may be necessary to have an opt-out in order to alleviate this problem.  That said, I think that the opt-out might end up decreasing overall preference for underrepresented groups.  I will presume that the opt-out/in is private discussion with the tabroom.  If that is true, then all judges that might opt-in (or not opt-out) would be considered in for the purposes of individual teams’ preferences.  Even if Judge Z, who is black, opts out of the affirmative placement system, teams that do not want ol’ Z in the back of the room will move Z down accordingly.  This is likely in both Group A and Group B.  That means that for every judge that opts out, it is likely to reduce the preference of underrepresented judges in the aggregate.  Even publicizing the list does not likely help.  It very well might discourage some people from opting-in.  The stigma, while often downplayed by every party, is a real phenomenon.  People often reject preference based systems, even a system that benefits themselves, that are not grounded in merit (as defined by the already-existing power structure). 

This will not change when some judges will be given affirmative action.  The “market” knows how to correct for this.  The teams will rank judges that they perceive to be likely to get a bump lower on the pref sheet.  This happens all the time with structural -isms—individuals perceive that they are not the problem, so they continue to act in the way that benefits them.  I simply cannot imagine a highly-competitive team not scenario-planning the way to continue to receive the most favorable prefs for them to continue to operate as they currently do. 

This may seem like a cynical take on this option.  It certainly may be just that.  However, this solution is only necessary because people continue to act in a way that maximizes their own success contrary to the stated goal of this project to increase judge diversity.  It is predictable that people will continue to act as they have before.  To predict otherwise would mean that people are CHANGING their behavior.  People tend to be conservative in both their evaluation of their own behavior and the evaluation of the change that they would need to make to remedy any perceived problem.  Cognitive dissonance is already very high.  That seems like how institutional racism operates.

Mutuality over preference
Obviously, my whole point is that we can’t determine mutuality given current preference systems/judge pools. A judge ranked 50 for two different teams has demonstrably different utility. Mutuality only is very conservative.

Have no extra rounds and have judges fill commitments:
This is not a bad idea, but I don't think it fixes too much. If it is used to decrease the cost of the tournament and thereby decrease fees for attending teams, this could be positive. As far as diversity goes, it is really just shuffling judges at the margins. The hard to place judges get placed in rounds that are perceived to not matter. Often they are placed in JV or novice. Certainly this is not nothing, but it is hard to measure. It has a couple of perhaps negligible, but perceivable negative outcomes that I have seen in tab rooms before. First, the judging is too tight. Judges want or need some specific rounds off. Maybe we just say tough, but that is not always an option. The tournament might slow down. The fact that this has functioned reasonably well at a large national tournament might obscure the fact that it is more difficult to have a tight pool at a small tournament. You can't have one judge not show up and ruin a tournament. There needs to be flexibility. Second, there are judges that teams perceive to be either wholly unfair or in some other way unacceptable in the back of the room. Not because of argument choice, but probably due to some interpersonal problem. I suppose that could be solved by conflicting the judge, but I am not sure how that works if half of the pool conflicts someone that they perceive to be inappropriate while judging. This may be a concern for much smaller tournaments, but I think that it is a real problem.

The largest problem is that it does nothing to address the differences in marginal preferences. No team’s preferences would substantively change. It would also have the effect of making preferences harder to fill for teams that already have difficult preferences, because flexibility of the judge pool is good for everyone’s preferences, if we presume that preferences are a good idea. Any increase in demand for a judge’s labor trades off with the supply. In an attempt to utilize the entire labor pool, this option constricts supply and thereby choice. The solution is in correcting the choice variable. If we act to increase the number of teams that are judged by the limited pool of judges from underrepresented groups, then you necessarily decrease the number of times a team that normally prefers that judge or group will get to have that judge. This would be like fixing unemployment by culling the labor force.

This has the negative effect of limiting the income of some individuals in the community that are likely already working at or below minimum wage. It also might mean that judges that would have attended no longer attend because they are not needed to fill commitments. In our community, that often falls to the coaches that are already marginalized financially or socially. Both of these effects are regressive to the stated goals of judge diversity and retention of marginalized coaches.

It is also interesting to note the divergent value of the judging because we have not accurately valued the labor of a non-white-male judge.  We have not priced in the externality of non-diversity.  The aggregate demand is the sum of the demand for each of the teams based on the preference system.  However, the value (as noted from virtually every conversation on the issue) of diversity is very high.  But the aggregate demand curve does not reflect that.  We, then, let teams pass that cost onto the whole.  Of course, there is no way for the whole to absorb that cost financially, only materially in the lack of diversity.  The money to increase supply could be considered a tax corrective to internalize the externality of lack of diversity.  One possible solution would be for tournaments to place more value in the judge obligation system to judges that meet certain criteria. It would then give teams more incentives to hire judges that reflect those goals. If a black judge counted 2 rounds of obligation  for each round of commitment, it could start to change the dynamic in many ways.
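
The obligation idea at the end of that paragraph is simple arithmetic, sketched here in Python. The multiplier of 2 comes from the example above; the function names and the 8-round commitment are hypothetical, and how a real tournament would define or apply the criteria is left open.

```python
def obligation_credit(rounds_judged, multiplier=1):
    """Rounds of obligation credited to a school for the rounds one judge works."""
    return rounds_judged * multiplier


def rounds_needed(obligation, multiplier=1):
    """Rounds a judge must actually judge to cover an obligation, rounded up."""
    return -(-obligation // multiplier)  # ceiling division on integers


# Hypothetical school that owes the tournament 8 rounds of judging.
owed = 8
print(rounds_needed(owed))                # judge credited at the normal rate
print(rounds_needed(owed, multiplier=2))  # judge credited at the doubled rate
```

In this sketch a doubled credit makes such a judge cover twice the commitment for the school, which is the hiring incentive described above.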

Hire more diverse judges/coaches
This is assuredly a good idea. However, this is not a solution to preference problems.  There are two general market failures that are occurring with diverse judging.  We have both a supply limit and a demand shortage.  Of course, we aren’t talking about just some commodity, but a labor pool of people.  Any attempt to increase demand necessarily increases the burdens on the supply.  Since the demand pool is finite, i.e. we are not adding more rounds, the pressure on the “non-diverse” labor pool is decreased.  This is primarily a fairness issue.  If, however, we focus on supply, I think that we might find that it does not solve the problem alone, either.  If there is no demand, then you are just increasing the labor pool of people that can’t find work, kind of like increasing the age for Social Security benefits.  Therefore, I think that it is concomitant that we work on both sides of the problem.  Affirmative action, even if it works, puts more burdens on the minority judge, and that is a fairness issue.  Some would call that “benefitting” the minority judge, but not all the judges feel that way.  It really is a benefit to everyone (as the economy advances in diversity metrics), but a negative to the previously under-worked judge.  I think the frustration is that the supply of diverse judges has gone up recently, but the demand has not followed.  There were plenty of black and/or female judges available for the final round or the NDT panel, but there was no demand for their labor.  This solution is very long-term and requires a lot of feedback loops before it reaches goals that we are trying to achieve short-term. It is, however, a requirement in the long-term, so it should not be diminished.

Recruit more diverse students
This is obviously a good idea. However, it can’t be the solution to the short-term problem of judge preference failure. The community has a serious problem with retention of under-represented groups.  Anecdotal evidence and experience in intercultural communication would suggest that the problem is likely related to the judge preference shortfalls. If we recruit more people who do not feel valued, then it is unlikely that we will end up with a bunch of people that stick around to be judges and/or coaches.

Larger panels
To try to have more underrepresented judges in elims, panels have been expanded to 5 judges.  It (arguably) enabled more underrepresented judges to be placed in the elims, but it has the side effect of diluting their impact on the round.  We could have 7 or 9 judge panels, with everyone judging the elims…but the same people at the margins will continue to be marginalized by the content of the round and the lack of people that look like them judging. This is amplified by attempts to make sure that every panel has a diversity person on it.  This diversity strategy advances the numerical placement of minority judges in elimination rounds, while also potentially limiting the minority judge’s capacity to impact the decision by spreading the thin judging pool across each debate.  At most tournaments, the supply is too constrained to spread across debates without guaranteeing that under-represented judges are the minority in each panel.

It also increases labor on a pool by putting pressure on particular people to do more work, when they probably already perceive that they are not valued by the participants.

Finally, it has the problem of only affecting teams that have reached the elimination rounds. If the system is flawed in determining who is in elimination rounds, it hardly seems like the appropriate corrective to change the judging pool there.

People should become more familiar with judges/give more details in their judging philosophy/do a video judge philosophy
On the issue of more information/familiarity alluded to here, I also think that this can help, but only if you believe that the “other” ideology is legitimate.  The most likely result is that conservative (in debate practices terms) teams will find that most judges do not reject them out of hand.  White people will notice that most of the videos have a white person in them. Males, etc…  The presumption that more information smooths out the problems is also part of the marginalist utopia.  If we only knew which product was best, we would pick that; after all, we are all rational.  The paradigms are a bit too much like marketing.  Commercials make total crap seem appealing.  We have judge philosophies, but they don’t help all that much (and nobody seems to read them).  More self-disclosing judge philosophies would likely be similar.  We actually have a much better way of determining this, and it is with experience. 
I don’t see why this is not a great argument to radically diversify your preferences throughout the year to gather more accurate information on judges in preparation for the national tournaments.  This is why I think that teams that make large changes to their preferences to get people that they might not have before will benefit in the long run.  As the preference system has become better at creating bubbles around teams, they lose some resiliency.  Bad judges or bad experiences should count, but we all might need to be very introspective about why we consider each judge or experience bad.  Familiarity can’t smooth everything out.

System Complexity, tradeoffs and definitions.
Evaluating the specifics of the supply/demand shortfalls of this labor pool is interesting and reveals the divergent results for different options. For some teams, the demand for certain judges outstrips supply.  That would mean that any increase in aggregate demand actually decreases supply/increases the demand shortfall for these teams.  Since, often, that group of students also is a demographic group we are trying to increase in participation and retain, we have to be attentive to that. If we artificially increase the demand for a judge in a debate where neither side would like them to judge, that may further some pedagogical goals. However, with the limited supply pool, that decreases the supply for the teams that do want that judge. It also might not make the judge feel more included in the community.

This has all kinds of implications for how you resolve the problem.  First, if we increase demand, we need a way to resolve the fairness issue.  That, of course, would change the intersection of the aggregate supply and demand curves, and the acute supply for teams that already had high demands.  Those teams have a highly elastic demand curve, because at some point the pool of judges that will give them what they perceive to be a fair shake flatlines.  Therefore, any policy that increases demand should also try to increase supply in some capacity. 

What this means to me is that we can’t just focus on one issue; we need programs that address all of the problems just within the bounded system of a tournament. 

Team-level preferences

If you have read the problems with the current system as I outlined and the problems with the enumerated solutions, you can probably see why a system that attempts to correct the demand-curve/marginalism failure is required. A system that changes the judge placement algorithm to (partially) account for the failure of the market at the team-level is a corrective that has many benefits. The change could be similar to adding 25% weight to the preferences of a team with an individual from an under-represented group.

This could be defined in many ways, such as any team with a student who is black, female, LGBT, disabled, etc., although I personally think that we should stick to a limited set of groups for which there is significant evidence of structural discrimination in the activity. This is, honestly, the most difficult portion of this solution. If everyone gets a bump, it does nothing.
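
To make the 25% idea a little more concrete, here is a minimal Python sketch of one way the weight could enter a placement decision. It is an illustration under my own assumptions, not how any existing tab software works: the names `Team`, `cost`, `place`, and `BOOST` are made up for the example, preferences are assumed to be percentile ranks where 0 is the most preferred judge, and the placement is assumed to pick the judge with the lowest combined score for a single round.

```python
from dataclasses import dataclass
from typing import Dict, List

BOOST = 1.25  # hypothetical: the "25% weight" applied to a qualifying team


@dataclass
class Team:
    name: str
    prefs: Dict[str, float]  # judge -> percentile rank (0 = most preferred, 100 = least)
    boosted: bool = False    # hypothetical flag: does this team receive the adjustment?


def cost(judge: str, a: Team, b: Team) -> float:
    """Combined dissatisfaction with a judge; a boosted team's rank counts 25% more."""
    weight_a = BOOST if a.boosted else 1.0
    weight_b = BOOST if b.boosted else 1.0
    return weight_a * a.prefs[judge] + weight_b * b.prefs[judge]


def place(judges: List[str], a: Team, b: Team) -> str:
    """Pick the available judge with the lowest weighted cost for this pairing."""
    return min(judges, key=lambda j: cost(j, a, b))
```

Suppose team A ranks J1 at 10 and J2 at 40, while team B ranks J1 at 45 and J2 at 10. Unweighted, J2 is placed (50 versus 55). If only A is boosted, J1 is placed instead (57.5 versus 60). If both teams are boosted, the result goes back to J2 (62.5 versus 68.75), which is the "if everyone gets a bump, it does nothing" problem in miniature.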

It fulfills larger diversity goals
While many judge diversity options are focused primarily on the placement of judges in rounds, this option is focused on correcting for the comfort and inclusiveness for the students in the round. This is not to minimize the goal of increasing judging by under-represented groups, but to focus that on the level of the team. At least some of the recruitment and retention of under-represented groups is likely attributable to the perception that they are not welcome. Because of supply limits and other complex variables, there is not really a system that can make all students get the judge that looks like them, has similar experiences, or thinks like them each round. It is also true that having all the available black judges in rounds does not necessarily make the black students in the activity feel more welcome. Seeing black judges on panels might not do much either in this regard. For the students, the in-round experience is likely a significant factor in whether they continue with the activity. Team-level affirmative action will make it more likely that the under-represented student is comfortable with the judge making the decision.

It will (probably) fix a great deal of the judge inclusion problem. This could be something that happens slowly as students are retained, hopefully, and coaches and judges are hired/retained. It should have the immediate benefit of increasing demand on the pool of underrepresented judges. The increase in demand should not skew the pool the way other demand increases do, because the demand increases happen for teams that already actually prefer the judge rather than universally. Teams that do prefer under-represented judges are able to express that preference more effectively. The differences in the marginal utility of the judges to each team are thus diminished.

It would make gaming the system very difficult
While it would not eliminate gaming of the most obvious form, like placing all judges of a particular group at the bottom of the preference sheet, it would make it very difficult to deal with the rest of the judge pool, over which the preference-receiving team also has additional control. The complexity would also now play against the gaming as it relates to teams within a team’s ideological in-group. Competitive gaming will persist, but its effects will be muted, more or less depending on the affirmative weight that you apply to the under-represented team.

There are no easy solutions, and each solution will likely generate some negative feedback loops. I would suggest that since this is a change to the algorithm, it could be tested more effectively in prelims at a large tournament by using it in 4 randomly chosen rounds and not in the other 4. I am sure others will have some method ideas for testing. I look forward to some feedback.
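
As a rough sketch of that test design, the snippet below randomly splits the prelims into rounds that use the adjusted weights and rounds that do not. It assumes an 8-round tournament (4 with, 4 without), and `split_rounds` is a made-up helper name for illustration, not part of any tabulation software.

```python
import random
from typing import List, Optional, Tuple


def split_rounds(num_prelims: int = 8, num_adjusted: int = 4,
                 seed: Optional[int] = None) -> Tuple[List[int], List[int]]:
    """Randomly choose which prelim rounds use the adjusted weights.

    Returns (adjusted_rounds, control_rounds), both sorted, using 1-based round numbers.
    """
    rng = random.Random(seed)  # a fixed seed makes the draw reproducible and auditable
    rounds = list(range(1, num_prelims + 1))
    adjusted = sorted(rng.sample(rounds, num_adjusted))
    control = [r for r in rounds if r not in adjusted]
    return adjusted, control
```

Publishing the seed after the tournament would let anyone confirm which rounds were in each condition, while keeping teams from knowing in advance which rounds were adjusted.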

September 19, 2014, 08:37:57 AM

Very interesting.  My initial impression is that I like it. 

Two thoughts, neither of which should be read as dismissals of the proposal, but rather just as attempts to flesh it out a little more:

There's a risk that some might seek to game the system by falsely or disingenuously claiming minority status, either by outright lie or by appeal to that great great great grandfather who was Native American.  Hopefully, we can rely somewhat on coaches to manage this type of thing.  Would there be a large enough disad to making the identification public?  I could maybe imagine some things that people don't want to reveal, but if we're talking about correcting for structural discrimination that affects debate success then I would assume that their identity is somewhat public anyway.  I suppose maybe there are some with more internal or psychological issues that affect their preparation before they get to the debate but that the judge can't "see."

There is a problem of quantification.  The proposal mentions something like 25%.  I don't know enough about the software to say anything educated about that, so I'll assume that it's an appropriate number.  My question, though, is whether that number should be the same across groups.  Do black debaters have the same level of structural disadvantage as LGBT debaters, etc?  While I can imagine that it would be really tacky to try to say that black = 25% and woman = 17.5% and trans = 21.3% and black woman = 31% and so on, I also wonder if keeping it the same for all groups would a) adequately address different positionalities, and b) be satisfactory to those affected.
September 19, 2014, 10:31:14 AM

Would it work to have the weighting percentage adjusted not based on "how oppressed" we perceive the group to be in some abstract sense, but instead on how marginalized that group is in the overall prefs?  So if black judges are more marginalized than woman judges, then the weighting of black prefs would be higher?  

I suppose that introduces some logistical problems, like relying on judges to self identify and groups that are too small to provide a good data set.  It also maybe assumes too much of a 1 to 1 identity correspondence between judge and debater.
September 19, 2014, 12:37:40 PM

Instead of saying LGBT students can we say "LGB" and "transgender"? I'm a transgender student that debated for four years and never had a single openly trans judge in the back, but had plenty of LGB judges. Putting them into one category makes it seem like they need to be addressed equally while in reality there are huge differences in structural barriers for these individuals' inclusion in debate. Gender segregation of hotel rooms, ensured access to gender neutral bathrooms at debate tournaments, general attitudes towards pronouns and their significance and most importantly debate community drama and shade are problems that made my experience in debate a lot more difficult than any form of discrimination I've ever faced for being sexually atypical.
kevin kuswa
September 19, 2014, 03:03:10 PM

thanks for all this work and thought, matt.  reading through your proposal with great interest, so glad the community is really thinking hard about these tough questions.  kevin
September 23, 2014, 06:46:59 AM

Oi.  We get hundreds of posts on some of the most trivial things in the facebook group, but here's a serious and well-conceived post, and ... crickets.
September 23, 2014, 06:48:57 AM

Maybe I'm being unfair.  GSU happened over the weekend, and people were probably just busy.
September 23, 2014, 10:56:47 AM

It’s easy to understand why Vega’s proposal hasn’t been widely discussed. There was a similar discussion on our distinct listserv last year, and many felt it wasn’t very easy to follow. Plus, MPJ functions to manage tension in NDTCEDA debate, so talking about it without bringing that tension to the table is difficult.

The things I like most about Vega’s proposal are:
a.   The acknowledgement that ideology (specifically debate ideology) is an important factor in this discussion. Based on the sum total of my discussions with people across the spectrum, I think it is the #1 factor determining how people will react to specific proposals such as this one.
b.   The realistic assumption that each team’s preferences are designed to maximize the probability of victory in the near term, by getting a judge who is fairly open to whatever strategies given teams consider to be their strengths.
c.   The acknowledgement that the still ambiguous parts of the solution are the most important ones.
d.   The suggestion of video recorded judging philosophies. I would love to see interviews with unfamiliar judges about any number of important questions which might impact their perspective on debates. I remain convinced that unfamiliarity is a large factor, and most written philosophies are less useful than watching even one RFD. Thus, it may be that teams are using preferences sub-optimally because they are unfamiliar. If they are not, please see item “a” on the next list.
e.   Vega himself. Really smart person with a good heart.

My hesitations include the following:
a.   Most debate rounds are between two teams with unequal performances and reputation before the debate begins. An additional pref boost for the team who started with the upper hand (given prior performances) seems problematic. I would be far more open to this sort of proposal if it was weighted inversely to an individual team’s recent success. Recent success seems a good way to assess if an individual team is currently (not just historically) marginalized in NDTCEDA debate.
b.   I have no idea how we measure if the goal has been met or not. This was posted on the heels of the very first regular season tournament to fall under the new CEDA Amendment, which was also supposed to address this problem. I have not seen data on whether or not it has, so far. This proposal emphasizes student diversity directly (and judge diversity only indirectly), but most descriptions of the problem suggest diversity is a bigger challenge in the judging pool than the student population. The definition of the problem isn’t stable.
c.   The assumption that mutuality is always conservative. Judging will always seem skewed from either side of the ideological struggle due to the parallax effect. The mutuality midpoint of NDTCEDA seems anti-conservative relative to EITHER the larger society OR even all types of debate. How much more anti-conservative should we be trying to move? One nexus question is whether there are more judges “quite unlikely” to vote for a planless aff than there are judges “quite unlikely” to reject a planless aff on a standalone “gotta have a plan” strategy. Probably true 5 years ago, but today I’m not convinced.

I teach today, so apologies in advance if I'm slow to reply.
September 23, 2014, 01:49:19 PM

There were some good responses on Facebook. Gary noted that it is difficult to just make these things happen because there are not many people that write debate judging placement algorithms. I still think that we should talk about what we would ideally like it to do, and I think that it might be possible to make it happen in the long term if we ask Bruschke nicely.

Rashad also noted some different goals such as having all teams have a diversity of judges. That would change hiring and recruitment practices. You can read more from him on FB. I tend to agree, but I think that it is more important to give marginalized teams judges that they prefer versus spreading that small judging pool thin. These goals are, of course, convergent at different times. I think my proposal does address this to some extent, and does meet the goal of changing the hiring and recruitment patterns.

I would like to highlight a couple of points. Absent this change, I think it would be incorrect to describe any part of preference as "mutual" in anything but name. So any move to use mutuality or to broaden back to categories just misses the point. However, once you do make some correction, we can start to talk about broadening the base of judging. If people think that it is pedagogically important to have debaters judged by people outside of their comfort zone (not safety zone), then you can expand the range of judges, but you can't pretend that the judges are equally far outside each team's comfort zone unless you make the weight adjustment. As Devon noted on FB, there are judges that people don't feel safe with. This would not change with the weighting. All this is trying to do is treat judges with roughly equal utility for each team as equal.

Also, this proposal unequivocally supports mutual preference judging. I feel like most coaches and debaters want this to continue. We expect and demand that tournaments give us judges that typically stay in our top 60% or better. This is obviously an arbitrary number, but it is important to consider what characteristics the judges below 60% have. We obviously think that some judges are better judges. Some judges below that line are perceived as unfair or unqualified to judge. I think it is patently obvious that this is perceptually different for different teams. If we inflated the judge pool by adding 100 lay judges at tournaments, teams would not still think that 60% was fair; they would want top 30%. The point is that we are already making decisions based on ideology; we just mask it in the language of neutrality.
September 23, 2014, 02:27:35 PM

It’s easy to understand why Vega’s proposal hasn’t been widely discussed. There was a similar discussion on our distinct listserv last year, and many felt it wasn’t very easy to follow. Plus, MPJ functions to manage tension in NDTCEDA debate, so talking about it without bringing that tension to the table is difficult.

Sorry about the confusion of the proposal. I think that we deal with way more complex issues, though. It is probably just my confusing writing style or something.

The fact that MPJ manages tension in NDTCEDA is, I think, granted by my post. I am suggesting that its management is biased. MPJ is good, but this is not MPJ because the preferences are not mutual.

My hesitations include the following:
a.   Most debate rounds are between two teams with unequal performances and reputation before the debate begins. An additional pref boost for the team who started with the upper hand (given prior performances) seems problematic. I would be far more open to this sort of proposal if it was weighted inversely to an individual team’s recent success. Recent success seems a good way to assess if an individual team is currently (not just historically) marginalized in NDTCEDA debate.
This point strangely equates not being successful (losing) with being marginalized. While losing can sometimes be a result of being marginalized, we should not confuse losing with being historically under-represented. Some teams are really bad at debate. Some teams are really good. Across the spectrum, it would be completely wrong to say that losing is marginalization.

b.   I have no idea how we measure if the goal has been met or not. This was posted on the heels of the very first regular season tournament to fall under the new CEDA Amendment, which was also supposed to address this problem. I have not seen data on whether or not it has, so far. This proposal emphasizes student diversity directly (and judge diversity only indirectly), but most descriptions of the problem suggest diversity is a bigger challenge in the judging pool than the student population. The definition of the problem isn’t stable.

The problem is not solved. The CEDA amendment didn't require anything specific, and there is no control, so I am guessing that it would be difficult to measure. Suffice it to say that there are not representative numbers of women and minorities in judging or debating. I think that most descriptions of the problem miss a great portion of the problem. AND, identifying the problem as only the way the judges are treated conflicts with goals that I think are equally important. The diversity in the judging pool is virtually static. Recruitment and retention of students and progressing them to graduate schools and coaching is essential to change that judging pool. How we measure it in the grand scope is a question that we would be lucky to have to answer, but we aren't really all that close.

c.   The assumption that mutuality is always conservative. Judging will always seem skewed from either side of the ideological struggle due to the parallax effect. The mutuality midpoint of NDTCEDA seems anti-conservative relative to EITHER the larger society OR even all types of debate. How much more anti-conservative should we be trying to move? One nexus question is whether there are more judges “quite unlikely” to vote for a planless aff than there are judges “quite unlikely” to reject a planless aff on a standalone “gotta have a plan” strategy. Probably true 5 years ago, but today I’m not convinced.
I use the term "conservative" within the scope of debate practices, not American politics. (Being on the liberal side of the US populace probably puts us just to the right of reasonable.) It would be very difficult for a judging pool that was trained in debate traditionally not to be conservative vis-a-vis debate norms. Some people are more likely to be introspective about that than others. I do think that your measure here about plan debate is not the same as increasing groups of students and judges/coaches from underrepresented populations. I am quite sure that there are many debaters from traditionally under-represented populations that would like to debate in a litany of different ways, including having a plan. Focusing on plans misses the point that the proposal is not for "non-traditional" debaters to get preferences. There is virtually no data to suggest that we have fixed diversity in debate.

Thanks for the input. I will try to work on a less confusing way to write about this.

Typing on my phone, so I hope that makes sense.
