I went to 6 sessions during the conference: Ad Auctions; Industry, Recommending content and matchmaking; Security (x2); and Privacy (x2). A session is just a group of talks that are all in the same room one after another. Most sessions had 4 talks in a 90-minute slot. Here’s what I took away from each of them …
Other conference recap posts cover:
- Keynote talks.
- How my talk went, and what I learned about giving better presentations or asking better questions at conference talks.
- Fun things we did.
Ad Auctions: This was a session on the algorithms that advertising platforms use to determine how much to charge for an ad. While advertising is very relevant to security and privacy research, these talks were too low-level for me to get much from. However, the third talk in the session, GSP — The Cinderella of Mechanism Design, was (I thought) more approachable, and I learned a few things. The argument was that ad platforms could use a different type of auction algorithm – one designed for value maximizers – for ad auctions. They said this works because, while you might expect companies to use advertisements to maximize profit, advertisers actually try to maximize revenue (meaning they’re willing to spend more on ads even if the net profit is lower). This is, they said, because advertisers fall into three main categories: brand advertisers (e.g., Coke), who aren’t using any one ad to sell a product but rather to build name recognition; ad agencies, who get paid more if they spend more of their client’s money; and performance advertisers (e.g., Lending Tree), who really want to track each person who clicks their ads and find out whether they follow through (e.g., on getting the loan). I still don’t get the intuition for why performance advertisers don’t maximize profit, but it sounded like there was empirical evidence to back it up. This was definitely not the meat of the research, but it was something I didn’t know about.
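For context, GSP (generalized second price) is the standard ad-auction format the talk's title refers to: slots go to the highest bidders, and each winner pays the bid of the advertiser ranked just below them. A toy sketch of that pricing rule, with made-up bidders and a zero reserve price (this illustrates plain GSP, not the paper's proposed mechanism):

```python
def gsp_allocate(bids, num_slots):
    """Toy generalized second-price (GSP) auction.

    bids: dict of advertiser -> bid per click.
    Returns a list of (advertiser, price_paid) per slot, highest
    bidder first, where each winner pays the next-highest bid.
    """
    ranked = sorted(bids.items(), key=lambda kv: kv[1], reverse=True)
    results = []
    for i in range(min(num_slots, len(ranked))):
        winner, _ = ranked[i]
        # Pay the bid of the advertiser ranked just below; if there
        # is none, fall back to a reserve price of 0 for simplicity.
        price = ranked[i + 1][1] if i + 1 < len(ranked) else 0.0
        results.append((winner, price))
    return results

print(gsp_allocate({"coke": 3.0, "agency": 2.5, "lendingtree": 1.0}, 2))
# → [('coke', 2.5), ('agency', 1.0)]
```

The distinction the talk drew: a profit maximizer cares about (value − price) per click, while a value maximizer keeps buying clicks as long as its budget holds out, which changes which mechanisms behave well.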
Industry, Recommending content and matchmaking: I only went to one talk in this session, Analyzing Uber’s Ride-sharing Economy. I went to this mainly because the data used for the analysis came from Uber receipts sent to Yahoo email accounts, similar to the keynote speech; however, the abstract mentioned that they analyzed how ride cost, retention, and ratings vary with demographic information such as age, income, and race. This seemed like such an invasive amount of information for researchers to have access to (after Yoelle Maarek went to such lengths to make it clear that they were conscientious about privacy concerns). It sounds like much of the demographic information (e.g., race and income) was derived from zip codes, which helps a lot … Anyway, they found, among other things, that drivers who make a bigger portion of their income during “surge pricing” have lower ratings, and that female riders give female drivers worse ratings while there’s no difference for male riders. It was also useful to be reminded of how big companies think about “progress.” For example, they measure the “health of the system” in terms of “churn” (rate of attrition) and want riders and drivers to be “more engaged with the platform.”
Security 1:
- Who Controls the Internet? Analyzing Global Threats using Property Graph Traversals, did a “what if?” worst-case-scenario analysis of which countries or entities (e.g., big companies like Google) control the most servers and could launch attacks on the others.
- Next, Tools for Automated Analysis of Cybercriminal Markets presented tools that can analyze conversations about cybercrime in online forums (e.g., buying and selling illegal or stolen goods or services), along with results from applying those tools to these forums. It was an awesome glimpse into the world of cybercrime, and I plan to read the paper. I’m less interested in the tools than the analysis, but many of the examples she gave of what makes automated analysis hard involved funny misspellings or slang terms. Someone asked if newbies on these forums use more correct spelling/grammar (i.e., whether the veterans have learned those techniques to evade detection); so maybe I was on to something with the coc8ne post a while ago.
- Tracking Phishing Attacks Over Time is another one I want to read later – it wasn’t the type of analysis I expected based on the title, and my jet-lagged brain wasn’t able to catch up.
- Finally, Security Challenges in an Increasingly Tangled Web raised questions about what can go wrong when websites don’t know what resources (e.g., libraries) they load and rely on, and how to start fixing the problem.
Security 2: Honestly, these talks were much more low-level, which isn’t my research interest, and this was the end of the day, so I didn’t get much out of these technically. I did make some notes about how the presenters did (or did not) manage to make the talks approachable to an audience outside of their immediate research areas, which is something I generally aim for.
Privacy 1:
- My talk was first.
- Then, a talk showing it is possible to de-anonymize users and predict their locations from aggregated/“anonymized” location data: Trajectory Recovery From Ash: User Privacy Is NOT Preserved in Aggregated Mobility Data.
- Next, a paper/talk about tracking in Tor (which is supposed to keep users anonymous and untracked): The Onions Have Eyes: A Comprehensive Structure and Privacy Analysis of Tor Hidden Services.
- Finally, Deanonymizing Web Browsing Data With Social Networks showed that, given the anonymized browsing history of someone with a (public) Twitter account, they could use Twitter to figure out who the browsing history belongs to, even if that person never posts anything to Twitter. They were able to do this based on the intuition that the set of links any one person sees on Twitter is fairly unique. So if a browsing history includes enough of the links that someone would have seen in their Twitter feed, it’s likely that person’s browsing history. Interestingly, if you follow a bunch of people who post a ton of links, your browsing history would be more anonymous, because there are so many links that you likely don’t click on a large enough portion of them for the algorithm to work. They also used an interesting methodology that I’d like to learn more about: getting people to voluntarily donate a month’s worth of browsing history.
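That matching intuition can be sketched with a toy overlap score (my own simplification – the paper's actual method uses a likelihood model that, among other things, weights rare links more heavily; the usernames and links below are made up):

```python
def deanonymize(history_links, candidate_feeds):
    """Guess which Twitter user an anonymous browsing history belongs to.

    history_links: set of URLs from the anonymous browsing history.
    candidate_feeds: dict of username -> set of URLs that appeared
    in that user's Twitter feed.
    Scores each candidate by the fraction of their feed's links that
    show up in the history, then returns the best-scoring username.
    """
    def score(feed):
        if not feed:
            return 0.0
        return len(history_links & feed) / len(feed)
    return max(candidate_feeds, key=lambda user: score(candidate_feeds[user]))

feeds = {
    "alice": {"nyt.com/a", "arxiv.org/x", "blog.example/y"},
    "bob": {"espn.com/1", "espn.com/2"},
}
history = {"nyt.com/a", "arxiv.org/x", "wikipedia.org/z"}
print(deanonymize(history, feeds))  # → alice
```

Dividing by the feed size is what makes link-heavy feeds more anonymous, as the talk noted: the same number of matching links is weaker evidence when a candidate's feed contains thousands of links.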
Privacy 2:
- Pinning Down Abuse on Google Maps discussed how people commit fraud in Google Maps. For example, in some cities, most of the locksmiths you’ll find in Google Maps don’t (or didn’t – many of these problems have presumably been fixed since the research was done) have an actual physical location and are pretty seedy. This paper discusses what types of fraud are present in maps, in what quantities, how that varies by location, and how people commit the fraud in the first place.
- Extended Tracking Powers: Measuring the Privacy Diffusion Enabled by Browser Extensions was a talk I wasn’t mentally ready for, but I had an interesting conversation with the author the day before, and it’s work that builds on work done in our lab by Ada Lerner and Anna Simpson, Internet Jones and the Raiders of the Lost Trackers.
- Security Implications of Redirection Trail in Popular Websites Worldwide asked whether a.com, http://www.a.com, http://a.com, and https://a.com all go to the same place and, if not, why not and what the implications are for security/privacy.
- I’m super excited about the last talk we saw, Some Recipes Can Do More Than Spoil Your Appetite: Analyzing the Security and Privacy Risks of IFTTT Recipes. I had a long lunch conversation about this and other possibilities for IFTTT follow-on work, but this paper analyzes a dataset that includes almost 20,000 IFTTT “recipes” and finds that around half of the recipes violate secrecy or integrity rules. Secrecy rules are violated when a private event (e.g., being mentioned in a Facebook post) triggers information to be shared publicly. Integrity rules are violated when a public or publicly controllable event (like a post with a certain keyword making it to the front page of Reddit) triggers a private event (like unlocking a smart-lock door).
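As I understand it, that secrecy/integrity analysis boils down to an information-flow check over labeled triggers and actions. A minimal sketch of the idea – the channel names, labels, and example recipes here are my own illustrations, not from the paper's dataset:

```python
# Label each trigger and action channel as "private" or "public".
# These labels and channel names are illustrative assumptions.
TRIGGER_LABELS = {
    "mentioned_in_facebook_post": "private",
    "reddit_front_page_keyword": "public",
}
ACTION_LABELS = {
    "post_public_tweet": "public",
    "unlock_smart_lock": "private",
}

def classify(trigger, action):
    """Flag a recipe whose information flow crosses the labels."""
    t, a = TRIGGER_LABELS[trigger], ACTION_LABELS[action]
    if t == "private" and a == "public":
        return "secrecy violation"    # private info flows to a public sink
    if t == "public" and a == "private":
        return "integrity violation"  # publicly controllable event drives a trusted action
    return "ok"

print(classify("mentioned_in_facebook_post", "post_public_tweet"))  # secrecy violation
print(classify("reddit_front_page_keyword", "unlock_smart_lock"))   # integrity violation
```

With labels in hand, scanning 20,000 recipes is just applying this check to every (trigger, action) pair, which is presumably how the "around half" figure falls out of the dataset.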
Other papers I want to read (but didn’t attend talks for):
- Can you spot the fakes? On the limitations of user feedback in online social networks
- Detecting Collusive Spamming Activities in Community Question Answering
- A Selfie is Worth a Thousand Words: Mining Personal Patterns behind User Selfie-posting Behaviors
- … and probably several others. This conference had a ton of papers, and since security can apply to anything, it can be hard to predict what work will point toward interesting security/privacy challenges.