Information Access Seminar

Student Progress Reports

Friday, October 21, 2022
3:10 pm to 5:00 pm PDT

Arogya Koirala, Shai Dhaliwal, Calvin Lee, Alan Kyle, Sarah Barrington, Ameya Naik, and Siddharth Adelkar

Monitoring War Destruction from Space Using Machine Learning

Arogya Koirala

Extracting information on war-related destruction is difficult because it relies on eyewitness accounts and manual detection in the field, which is not only costly for the institutions carrying out these efforts but also unsafe for the individuals doing the work. The information gathered is also incomplete, which makes it difficult to use in media reporting, humanitarian relief, understanding human rights violations, or academic reporting. This seminar introduces an automated approach to measuring the destruction of war-damaged buildings by applying deep learning to publicly available satellite imagery. We adapt different neural network architectures to the building damage detection use case. As a proof of concept, we apply this method to the Syrian Civil War to reconstruct war destruction in the affected regions over time. In the last talk, we outlined the problem space and discussed data-related considerations to keep in mind when approaching the problem using machine learning. For this talk, we will take a closer look at the data, introduce the machine learning architectures that we will employ, and (if time permits) discuss the potential benefits and drawbacks of these choices as they relate to our goal of identifying war-related building destruction.

Modernizing Mainframe RACF

Shai Dhaliwal

During this seminar, I plan to explore how cloud modernization will improve cybersecurity for legacy mainframe information systems, focusing on research progress made in the following technical areas:

  • Structuring Mainframe Information, Metadata, and Databases
  • Tokenization, Privacy, and Standards
  • Time permitting: RACF User Migration to the Cloud and Its Benefits

Evaluating Consumer Robocall Mitigations

Calvin Lee

At the all-time high in October, 4.5B robocalls were made across the US. We will evaluate the most and least effective mitigations to date. We will also provide an update on the latest policies through which the Federal Communications Commission is placing new obligations on gateway providers to play a more active role in curbing this abuse.

Artificial Intelligence and Machine Learning Fairness

Alan Kyle

In this presentation I will share my progress in formulating my project on machine learning fairness toolkits. As a part of a large team, my contribution will focus on the policy aspects of these tools. Questions to be answered are: How are fairness toolkits used? What are the current AI policies that practitioners need to be advised on? And, how can organizations promote fairness practices through their internal policies?

English Documentation of Non-English Stories: A Study of the People's Archive of Rural India

Siddharth Adelkar

To understand what information is lost or gained in English-first documentation of Indian conditions, I investigate the discourse in Indian English and its closeness to other Indian languages. I focus on four dimensions of language: phonology, lexicon, syntax, and, most importantly, what is being said (the contexts). I consider the verbal and written communications of Indian English writers who are native Marathi speakers, and compare the distance between their English and their Marathi. If Indian English is closer to Indian languages than to, say, English (UK), and if the distance between them is comparable to that between native dialects of English, then it would be fair to say that Indian English is only as good or bad at documenting local conditions as the native languages.

The Fungibility of Non-Fungible Tokens: A Quantitative Analysis of ERC-721 Metadata

Sarah Barrington

Non-Fungible Tokens (NFTs), digital certificates of ownership for virtual art, have until recently been traded on a highly lucrative and speculative market. Yet the emergence of misconceptions, along with a sustained market downturn, is calling the value of NFTs into question. This project (1) describes three properties that any valuable NFT should possess (permanence, immutability, and uniqueness), (2) creates a quantitative summary of permanence as an initial criterion, and (3) tests our measures on 6 months of NFTs on the Ethereum blockchain, finding that 45% of ERC-721 tokens in our corpus do not satisfy this initial criterion. Our work could help buyers and marketplaces identify NFTs that may be overvalued and warn users against purchasing them.

Assessing “Data” in Data Subject Access Requests

Ameya Naik

Nearly every mobile or desktop application prompts you to accept a terms and policy agreement before you can use it. This agreement allows the application to gather certain information related to you, your activities, and your attributes. The level and type of information gathered depend on the organization, business model, kind of application, and geographical location. EU data protection law (GDPR) and the CCPA grant consumers the right to access the personal data a company holds on them. While GDPR (Article 15) and the CCPA have broadly drawn rules and regulations for Data Subject Access Requests (DSARs), there are still differences in the way the data is stored and shared with consumers (you). We have initiated DSARs for certain mobile applications that fall into similar categories, taking notes on the process, and comparing and visualizing the data returned.

This seminar will be held both online & in person. You are welcome to join us either in South Hall or via Zoom.

For online participants

Online participants must have a Zoom account and be logged in. Sign up for your free account here. If this is your first time using Zoom, please allow a few extra minutes to download and install the browser plugin or mobile app.

Join the seminar online

Last updated:

October 19, 2022