Team 2 at Masters of Networks 2 investigated the pattern of allocation of research funding in Italy, using official data from the Italian Education Department. They had rather intriguing questions: do research institutions form semi-stable coalitions to scoop up the funding? What role do external consultants play? Are they also part of the coalitions, if there are any?
Team 2 had a healthy mix of policy makers (mostly from the Education Department and the Italian Treasury) and network and data scientists, led by INRIA’s Guy Melançon and University of Bologna’s Matteo Fortini. Their conclusion:
These intense two days of collaborative work did convince the group of the potentialities of the SNA. One by-product clearly was to refine the questions policy makers originally had, in light of what the data was able to uncover.
Read the whole paper that came out of the session: it’s well worth it.
This is a writeup of the Team 1 hackathon at Masters of Networks 2. Participants were: Benjamin Renoust, Khatuna Sandroshvili, Luca Mearelli, Federico Bo, Gaia Marcus, Kei Kreutler, Jonne Catshoek and myself. I promise you it was great fun!
We would like to learn whether groups of users in Edgeryders are self-organizing in specialized conversations, in which (a) people gravitate towards one or two topics, rather than spreading their participation effort across all topics, and (b) the people that gravitate towards a certain topic also gravitate towards each other.
Understanding social network dynamics and learning to see the pattern of their infrastructure can become a useful tool for policy makers to rethink the way policies are developed and implemented. Furthermore, it could ensure that policies reflect both needs and possible solutions put forward by people themselves. The ability to decode linkages between members of social networks based on the areas of their specialization can allow decision makers and development organisations to:
Compared to traditional models of policy development, this method can allow for more effective and accountable policy interventions. Rather than spending considerable resources on developing a knowledge base and building new communities around a policy theme, the methodology would enable decision makers and development organisations alike to tap into available knowledge bases and to work with these existing networks of interested specialists, saving time and resources. Moreover, pre-existing networks of specialists are expected to be more sustainable as a resource of information and collective action than ad-hoc networks built around emerging policy issues.
Edgeryders is a project rolled out by the Council of Europe and the European Commission in late 2011. Its goal was to generate a proposal for the reform of European youth policy that encoded the point of view of youth themselves. This was done by launching an open conversation on an online platform (more information).
The conversation was hosted on a Drupal 6 platform. Using a Drupal module called Views Datasource, we exported three JSON files encoding respectively information about users; posts; and comments.
These data are sufficient to build the social network of the conversation. In it, users represent nodes; comments represent edges. Anna and Bob are connected by an edge if Anna has written at least one comment to a piece of content authored by Bob. We used a Python script with the Tulip library for network analysis to build the graph and analyze it. The result was a network with 260 active people and about 1600 directed edges, encoding about 4000 comments.
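The construction described above can be sketched in plain Python. This is only an illustration, not the script used at the hackathon: it does not use Tulip, and the field names (`id`, `author`, `parent_id`) and the assumption that post and comment ids never collide are hypothetical, since the layout of the exported JSON files is not given here.

```python
from collections import Counter


def build_comment_network(posts, comments):
    """Return a Counter mapping (commenter, content_author) to a comment count.

    Each key is one directed edge; the count is the number of comments
    that the edge encodes. Field names are hypothetical.
    """
    # Map every unit of content (post or comment) to its author, so that
    # comments on comments can be resolved as well as comments on posts.
    author_of = {p["id"]: p["author"] for p in posts}
    author_of.update({c["id"]: c["author"] for c in comments})

    edges = Counter()
    for c in comments:
        target_author = author_of.get(c["parent_id"])
        if target_author is None:
            continue  # the parent is missing from the export: skip it
        # Self-comments are kept here; the writeup does not say how
        # they were handled in the original analysis.
        edges[(c["author"], target_author)] += 1
    return edges


posts = [{"id": "p1", "author": "bob"}]
comments = [
    {"id": "c1", "author": "anna", "parent_id": "p1"},
    {"id": "c2", "author": "anna", "parent_id": "p1"},
    {"id": "c3", "author": "bob", "parent_id": "c1"},
]
network = build_comment_network(posts, comments)
```

In this toy example Anna commented twice on Bob's post and Bob replied once to Anna, so the network has two directed edges encoding three comments.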
To move towards our goal, we needed to enrich this dataset with extra information concerning the semantics of that conversation (see below).
To define to which degree people gravitate towards certain topics, and towards each other, we carried out “entanglement analysis” on a dataset containing all conversations carried out between members of the Edgeryders network. Entanglement analysis was proposed by Benjamin Renoust in 2013; we performed it using a program called Data Detangler (accessible at http://tulipposy.labri.fr:31497/).
These data can be interpreted as a social network: people write posts and comment on them; moreover, they can comment on other people’s comments. Within this dataset, each comment can be interpreted as an edge, connecting the author of the comment to the author of the post or comment she is commenting on. Alternatively, we could interpret them as a bipartite network that connects people to content: comments are edges that connect their authors to the unit of content they are commenting on.
Each of the posts written on Edgeryders is a response to set briefs, or missions, that sit under higher level campaigns. This means that many posts – and associated comments – live under the higher level ‘topic’ of one of nine campaigns.
In order to understand how the various topics and briefs connect to each other, we analysed the keywords that defined each mission/brief. This was carried out by manually analysing the significance of word frequency for each post. Word frequency was ascertained using the in-browser software TagCrowd (http://tagcrowd.com/faq.html#whatis) to work out the top 12-15 words per mission. We then manually verified these words (removing, for example, names, words that were too general, or words that were a function of the Edgeryders platform itself, e.g. ‘comment’ or ‘add post’).
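For readers who want to reproduce the frequency count without the browser tool, here is a rough Python equivalent. It is a sketch under assumptions: the function name, the tokenisation rule and the exclusion list are ours, and it replaces neither TagCrowd's own processing nor the manual verification step.

```python
import re
from collections import Counter


def top_keywords(text, excluded, n=15):
    """Return up to n of the most frequent words in text, most frequent first.

    `excluded` holds words to ignore, e.g. names, overly general words,
    or words produced by the platform itself.
    """
    # Crude tokenisation: lower-case runs of ASCII letters only.
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(w for w in words if w not in excluded)
    return [word for word, _ in counts.most_common(n)]


mission_text = (
    "Education education learning comment comment comment "
    "open education learning"
)
keywords = top_keywords(mission_text, excluded={"comment"}, n=2)
```

The candidate list produced this way would still need a manual pass, as described above.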
The combination of these three elements gives us a multiplex social network that is indexed by keywords. A multiplex social network is one where there are multiple relations among the same set of actors.
We dropped edges that are linked to only one brief. These are edges of ‘degenerate specialists’; as they only interact in the context of one brief, they are specialists only by default.
At this point, we had a multiplex social network of users and keywords. Users were connected by edges carrying different keywords – indeed, each keyword can be seen as a “layer” of the multiplex network, inducing its own social network: the network of the conversation about employment, the network of the conversation about education etc. Many of the interactions going on are non-specialized; the same two users talk of several different things. In order to isolate specialized conversation, for each individual edge of the multiplex we remove all keywords except those that appear in all interactions between these two users. In other words, we rebuild the network by assigning to each edge the intersection of the sets of keywords encoded in each of the individual interactions. In many cases, the intersection is empty: it only takes two interactions happening in the context of two briefs with no keywords in common for this to happen. In this case, the edge is dropped altogether.
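The two filtering steps (dropping edges linked to only one brief, then keeping only the keywords shared by every interaction on an edge) can be sketched as follows. The data layout is an assumption made for illustration: interactions are `(source, target, brief)` triples, `brief_keywords` maps each brief to its set of keywords, and edges are treated as directed pairs.

```python
from collections import defaultdict


def specialized_edges(interactions, brief_keywords):
    """Return {(source, target): keywords common to all their interactions}."""
    # Collect the briefs under which each pair of users interacted.
    briefs_per_edge = defaultdict(set)
    for source, target, brief in interactions:
        briefs_per_edge[(source, target)].add(brief)

    result = {}
    for edge, briefs in briefs_per_edge.items():
        if len(briefs) < 2:
            continue  # single-brief edge: a 'degenerate specialist'
        # Every interaction carries its brief's keywords, so intersecting
        # over briefs is the same as intersecting over interactions.
        common = set.intersection(*(set(brief_keywords[b]) for b in briefs))
        if common:  # an empty intersection means the edge is dropped
            result[edge] = common
    return result


brief_keywords = {
    "b1": {"education", "learning", "open"},
    "b2": {"education", "open", "work"},
    "b3": {"housing"},
}
interactions = [
    ("anna", "bob", "b1"), ("anna", "bob", "b2"),  # shared keywords survive
    ("carl", "bob", "b1"),                          # one brief only: dropped
    ("dana", "bob", "b1"), ("dana", "bob", "b3"),  # nothing in common: dropped
]
edges = specialized_edges(interactions, brief_keywords)
```

Only the edge from Anna to Bob survives in this toy example, labelled with the keywords "education" and "open".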
A nice side-effect of the two preceding steps (dropping single-brief edges and intersecting keywords) is to greatly reduce the influence of the Edgeryders team of moderators on the results. Moderators are among the most active users; while this is as it should be, they tend to “skew” the behaviour of the online community. However, dropping single-brief edges removes all the one-off interactions they tend to have with users that are not very active; and intersecting keywords removes all the edges connecting moderators to each other, because they – by virtue of being very active – interact with one another across many different briefs, and as a result the intersection of keywords across all their interactions tends to be empty.
We then identified groups of specialists by identifying those users interacting together solely around a small number of keywords (in our example, n(keywords) = 2).
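In the session this selection was done interactively. As a rough programmatic stand-in, one could keep the edges whose keywords all fall within a chosen set and read the groups off as connected components. Both the function below and the choice of connected components as the notion of "group" are our assumptions, not part of the original method.

```python
def specialist_groups(edge_keywords, focus):
    """Group users linked by edges whose keywords all lie within `focus`.

    `edge_keywords` maps (source, target) to a set of keywords.
    Returns a list of sets of users.
    """
    focus = set(focus)
    # Undirected adjacency, restricted to edges labelled only with
    # keywords taken from the focus set.
    neighbours = {}
    for (a, b), keywords in edge_keywords.items():
        if keywords and keywords <= focus:
            neighbours.setdefault(a, set()).add(b)
            neighbours.setdefault(b, set()).add(a)

    groups, seen = [], set()
    for start in neighbours:
        if start in seen:
            continue
        # Depth-first traversal collecting one connected component.
        group, stack = set(), [start]
        while stack:
            user = stack.pop()
            if user in group:
                continue
            group.add(user)
            stack.extend(neighbours[user] - group)
        seen |= group
        groups.append(group)
    return groups


edge_keywords = {
    ("anna", "bob"): {"education", "learning"},
    ("bob", "carl"): {"education"},
    ("dana", "eve"): {"education", "housing"},  # 'housing' is outside focus
    ("fay", "gus"): {"learning"},
}
groups = specialist_groups(edge_keywords, focus={"education", "learning"})
```

With the focus set to "education" and "learning", the toy data yields two groups: Anna, Bob and Carl in one, Fay and Gus in the other.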
The method does indeed seem to be able to identify groups of specialists. “Groups” is used here in the social sense of a collection of people that not only write content related to the keywords, but interact with one another in doing so – this is to capture the collective intelligence dimension of large scale conversations. Figure 1 shows some conversations between people (highlighted on the left) that only interact on the “education” and “learning” keywords (shown on the right). Highlighted individuals that are not connected to any highlighted edges are users who do write contributions that are related to those keywords, but are not part of specialized interactions on those keywords.
Once a group of specialists is identified, the next step is to look for the keywords that co-occur on the edges connecting them. An example of this is Figure 2, which shows the keywords co-occurring on the edges of the conversations involving our specialist group on education and learning. The size of the edge on the right part of the figure indicates that keyword’s contribution to entanglement, i.e. to making that group of keywords a cohesive one. Unsurprisingly, “education” and “learning” are among the most important ones. More interestingly, there is another keyword that seems to be deeply entangled with these two: it is “open”. We can interpret this as follows: specialized interaction on education and learning is deeply entangled with the notion of “open”. The education specialists in this community think that openness is important when talking about education.
This method is clearly scalable. It can be used to identify “surprising” patterns of entanglement, which can then be further investigated by qualitative research.
The main problem with our method is that it is quite sensitive to the coding by keyword. Assigning the keywords was done by way of a quick hack based on occurrence count. This method should work much better with proper ethnographic coding. Note that folksonomies (unstructured tagging) typically won’t work, as they will introduce a lot of noise in the system (for example, with no stemming you get a lot of false (“degenerate”) specialists).
Masters of Networks is essentially a hackathon. There will be no talks except a very short introduction by me. While hackathons typically organize themselves given good wi-fi and enough caffeine, we thought we would give it a modicum of structure. It works like this:
There will be two teams. Each is manned by at least one policy maker with a burning question; one network scientist; one developer who can hack around code on the fly; and, ideally, one statistician to secure whatever statistical analysis we might need to do. Each is equipped with one or more datasets. Here are the core teams:
You show up at 10.00. On Wednesday 9th, I (Alberto) will give a warmup presentation on what it means for policy makers to think in networks. Then we tackle the questions, and try to get to some answers by using network science and code. We add coffee as appropriate. It’s that simple. Maps and more practical info here.
We build teams just to save time. Pick the one you like best and get your hands dirty. There is plenty of room for everyone.
We still have one or two places. Contact alberto [at] cottica [dot] net
Designing for emergent effects in social dynamics may seem a contradiction in terms, and – if left unqualified – it is. And yet it is a very tempting goal for INSITErs, as we progress on our quest to imbue innovation activities with social values that would make any changes resulting from such activities “good”, for some value of good. Just one year ago, on the initiative of the University of Alicante group, we gathered an unusual bunch of policy makers and network scientists to look at public policy issues, and the data trail they leave, with this very question in mind. We called this gathering Masters of Networks.
I like to think of MoN as a success. I enjoyed it immensely; more importantly, it spawned a collaboration between the University of Bordeaux group (Benjamin Renoust in particular) and the World Bank, brokered by UNDP’s Millie Begovic. Millie also came up with a precious testimony of just how fruitful such diverse collaboration can be. Granted, dialog across our different languages was not always easy, but this is part and parcel of interdisciplinarity.
Since this approach seems to have worked, we are doing the obvious thing: iterating. Masters of Networks 2 takes place in Rome, at The Hub Roma, on April 9th and 10th 2014. Think of it as an interdisciplinary hackathon around network science, with policy makers to ask relevant questions, network scientists to help model them, data scientists to crunch the data and policy makers again to interpret the results. We will be working on two issues, in parallel:
For now, we have confirmed the presence of:
More invitations are in the loop; more importantly, everyone is welcome: the event is completely open. If you are interested in public policies, network science, data science; if you think you would enjoy an interdisciplinary public policy hackathon-ish event, Masters of Networks is the place for you. It is also free of charge, though we will have to cap the number of attendees at about 20 people. If you want to attend, just drop me a line at alberto [at] cottica [dot] net. We might even be able to help you if you wish to attend and need support! How cool is that? More detailed information will be released as it comes. Are you ready?
In this final runup to Masters of Networks – coming up next week – we have made a lot of progress. First of all, we have assembled a stellar and diverse lineup: policy makers, network scientists, data analysts and programmers from all over the place and all walks of life. Meet the Masters at this page. Second, we have agreed on the policy questions that we want to tackle, and turned them into what we call worktracks. And finally, we have assigned a “core team” to each worktrack to make sure each question receives due attention. People not assigned to a core team are free to wander as they choose. In the final session, we’ll all present to each other. All that is left is to go in and just do it! Here is a list of worktracks:
WT1. Is it who you know? We test for evidence of brokerage (rather than proposal quality) in access to EU regional development grants or other sources of public funding. Read more. Question masters: Tito Bianchi, Millie Begovic, Stefano Bertolo. Network scientists: Bruno Pinaud, Marie-Luce Viaud. Data Analyst: Michele Pezzoni. Programmer: Benjamin Renoust.
WT2. Designing scaling into online collaboration. We look for signs that management decisions in a recent public consultation have led to scalability, in the form of the emergence of specialized subcommunities. We’ve got data here! Read more. Question masters: Alberto Cottica and Marco Bani. Network scientists: Fernando Vega-Redondo, Guy Melançon. Data Analyst: Raffaele Miniaci. Programmer: Dario Bottazzi.
WT3. Tracking a democratic conversation across different online media. If we accept a description of Social Media as being a ‘Networked Public’ then understanding the networks that make up the informal civic conversation around either a topic or a geography is vital to ensure this more open contribution. Read more. Question masters: Catherine Howe and Ade Adewunmi. Network scientists: Matteo Fortini, Gaia Marcus
WT4. Managing diversity in social networks or Organizational adaptation to the threat of exit of key members. Question master: Sergio Currarini. Network scientists: Michele Pezzoni, Marco Bani.
The event is fully booked: we can’t accept any latecomers, sorry! But we will blog and let you know about it.
Tito Bianchi – an economist working for the Italian ministry of economic development – wants Masters of Networks to look into EU regional development policies. Tito’s unit focuses on the Italian less-developed Mezzogiorno, an area that shares some features with developing countries, including traditionally opaque mechanisms for allocating public sector grants. In his own words:
I would like to test the validity of statements that are often made with regard to my policy domain of interest – the EU regional development policies – that funds are often granted more based on personal connections, sometimes degenerating into outright clientelism or corruption, than on project quality. Formally, most projects are funded after a competitive selection process that should reward project quality consisting in some kind of positive externality or dynamic social economic effect. This process should be explicit and transparent in almost the totality of the cases of subsidies to private firm investments. However, in the public discourse one often hears the voice of those who argue that the quality of project proposals is less important than the reputation or connections of the applicant: “it is always the same people who have access to the funds, it’s who you know..”.
My hypothesis is that, while cases of malfeasance and inappropriate behavior of institutions in charge of funds’ management exist, especially in underdeveloped areas the public perception of the incidence of these cases exceeds the real proportions of the phenomenon. The reason why the perception of clientelism may be greater than its real magnitude is twofold:
- most projects are filed with the help of consultants. Those consultants/intermediaries have skills that are fungible: they consist in knowledge of the procedures through which funds are awarded, how to fill the forms, what the regulations prescribe, etc… In competing with each other for clients who want to submit investment proposals for funding, these consultants have an incentive to induce them to think that funds are awarded based on personal connections and that they possess the right connections. The view that the selection process is fair and competitive reduces their market power.
- the losers of competitive selection processes have an incentive in spreading the view that this process is unfair, to account for their lack of success.
This is no small issue and goes at the root of the development process itself. In underdeveloped regions the reputation of government policies for being corrupt discourages effort and investment, thus perpetuating economic and social backwardness. If the system is known to be corrupt and rewards rent-seeking more than the pursuit of new projects, I would rather put my effort in unproductive activities than in those that produce wealth and social benefits. Conversely, if we were able to demonstrate that unfair allocation of public resources is the exception and not the rule, well-motivated people would be more incentivized to act and compete for public subsidies and rent-seekers less inclined to spend their time trying to bend public decision in the direction of their own interest.
The database of ALL regional development policy projects in Italy (2007-2013) is searchable at: www.opencoesione.it, and fully downloadable at: http://www.dps.tesoro.it/opencoesione/ml_en.asp. I hope to see many of you in Venice!
Masters of Networks is a workshop that brings together cutting-edge policy makers and network scientists. We aim to come up with a specification in terms of networks of some public policy problems, and a viable strategy to address them in new ways. Information and registration here.