Research Summary:
• User modeling (personal data, cyberaggression, Q&A platforms, web browsing, etc.), online user privacy, data mining, analysis and machine learning on large-scale data
• Privacy-preserving machine learning
• Streaming graph analysis, workload analysis on distributed stream processing engines, real-time bidding (RTB) advertising analysis
• Design, implementation and testing of large-scale, socially-aware distributed systems
Technical Summary:
• Big data mining, analysis and machine learning on large corpora of activity logs: social networks, RTB advertising, HTTP logs from web browsing, stocks, web clicks, search queries
• Data and system performance analysis, testing and simulation/emulation of large-scale distributed systems
• Experimentation on large-scale distributed platforms: LAN clusters, PlanetLab, Hadoop, SAMOA, Storm
• Streaming graph analysis: MapReduce, custom scripts, JUNG, SNAP, R, NetworkX, Gephi, UCINET
• Java, shell scripting (sh, csh, AWK), Hive, Pig, MATLAB, SQL, C/C++
• LaTeX, Gnuplot, Eclipse, OmniGraffle, MS Office