TUTORIALS (TENTATIVE)

Joseph Barr, Trust Science, Canada
Title: A tutorial with R

 

Abstract: The body of knowledge we call machine learning consists of a multitude of statistical and heuristic algorithms in which data comes in on one side and what we deem "actionable insight" comes out the other. I will focus on a handful of case studies and demonstrate how these "my favorite" algorithms can solve business problems.

Bio: Dr. Joseph ("Joe") Barr is a board member of Trust Science (trustscience.com), a "Next-Gen" credit bureau or "Bureau 2.0". Joe helps with strategic initiatives, capital raising and other data-related initiatives. Joe's past roles include Chief Analytics Officer at Home Union (www.homeunion.com), where he led a team of data scientists to develop price/rent valuation (AVM) models, a real-time pricing product "The Right Bid", a neighborhood investment risk score "NIR", a risk-adjusted ROI "RealEstimate", and a unique home price index "HU HPI". Prior to that Joe was the Chief Data Scientist at ID Analytics (acquired by LifeLock, which was then acquired by Symantec after a successful IPO). Dr. Barr was responsible for developing ID Analytics' identity score (ID Score), fraud rating (IM Score), and subprime consumer credit risk scores (Credit Optics). Joe also served as CTO at Chi Analytics, where he developed products for the collection and payday loan industries. Joe's experience in the financial services sector spans more than two decades, including senior roles at ABN-AMRO/LaSalle Banks of Chicago, Citigroup, Wells Fargo, Fannie Mae and First American. Joe holds a PhD in mathematics from the University of New Mexico, where he completed his thesis on heuristics in discrete optimization.

Amit Chakraborty, Applied Data Technologies

Title: Image Processing and Image Pattern Recognition - A Programming Tutorial

 

Abstract: Image recognition is a major application area of machine learning. It is evolving at a rapid pace, with a number of programming platforms available to developers. While each platform has its own characteristics, the methodology of image recognition consists of a sequence of image processing tasks, development of a classifier algorithm, and training and testing, followed by deployment. This tutorial will delve into the programming aspects of image processing, including thresholding, contouring and template matching. To provide practical hands-on programming, this tutorial will look closely at three real-life applications of image pattern recognition, namely ALPR using Tesseract OCR, vehicle detection using Haar classifiers and human detection using HOG classifiers. The tutorial will explain each algorithm and its implementation, from pseudocode to code in Python, Java and C++, using two major platforms: OpenCV and TensorFlow.

 

This tutorial is aimed at students and professionals with experience in software development who wish to develop competence in image pattern recognition. Many data scientists who are skilled in algorithm development will also find this tutorial of interest. By the end of this tutorial, participants can expect to take away a good understanding of how to implement practical applications of image recognition programmatically.
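As a rough illustration of two of the image processing steps named in the abstract, thresholding and template matching, the sketch below works on a tiny grayscale image held as a list of lists. It is not taken from the tutorial material: the function names `threshold` and `match_template`, the sample image, and the sum-of-squared-differences score are all assumptions chosen for illustration. A real pipeline would more likely use a library such as OpenCV.

```python
def threshold(image, cutoff):
    """Binarize: 1 where the pixel is strictly above cutoff, else 0."""
    return [[1 if px > cutoff else 0 for px in row] for row in image]


def match_template(image, template):
    """Slide the template over the image and return the top-left corner
    (row, col) with the lowest sum of squared differences, plus that score."""
    ih, iw = len(image), len(image[0])
    th, tw = len(template), len(template[0])
    best_pos, best_score = None, None
    for r in range(ih - th + 1):
        for c in range(iw - tw + 1):
            score = sum(
                (image[r + dr][c + dc] - template[dr][dc]) ** 2
                for dr in range(th)
                for dc in range(tw)
            )
            if best_score is None or score < best_score:
                best_pos, best_score = (r, c), score
    return best_pos, best_score


# Hypothetical 4x4 grayscale image with a bright 2x2 patch in the middle.
image = [
    [10, 12, 11, 13],
    [11, 200, 210, 12],
    [10, 205, 198, 11],
    [12, 11, 10, 13],
]
binary = threshold(image, 128)
template = [[1, 1], [1, 1]]
position, score = match_template(binary, template)
```

Thresholding first reduces the image to foreground and background, so the template match afterwards compares shapes rather than raw intensities.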

 

Bio: Amit Chakraborty is the Chief Solution Architect at Applied Data Tech Inc. He is an experienced technology leader with a track record of developing technology products and driving market adoption in the machine learning and analytics, cloud and mobile domains.

 

At Applied Data Tech he has been instrumental in developing VisionBot, a cloud-based multi-tenant image recognition platform. Prior to founding Applied Data Tech he was CTO of EnableM Technologies, where he led the building of an award-winning cloud-based adaptive learning platform. Before that he worked as Development Director in Location-based Analytics for Rolta Ltd. Prior to that he worked as Engineering Manager for Nokia/Symbian and in various capacities with Motorola and Openwave. Amit holds a B.Tech degree in Electrical Engineering from the Indian Institute of Technology.

 


David A. Ostrowski, Ford Motor Company, USA
Title: Artificial Intelligence with Big Data

 

Abstract: Big Data has presented new opportunities for the Artificial Intelligence practitioner. Unlike traditional application development, Big Data application development presents unique challenges in harnessing open source frameworks such as Apache Hadoop and design patterns such as MapReduce. Through these advances, Big Data has become ubiquitous across all areas of research, allowing for new applications that were not possible earlier.

 

The tutorial will step each student through the installation, setup and implementation of the Apache Spark architecture on Amazon Web Services. The tutorial will also present a detailed examination of the Spark distributed processing platform, with examples using the Scala programming language. Machine learning techniques will also be covered using the ML library on the Scala/Spark platform. Additional technologies, including Python with TensorFlow and scikit-learn, will be presented for comparison. Overall architecture design will also be explored, as well as the future direction of the Big Data software suite.
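For readers unfamiliar with the MapReduce design pattern mentioned in the abstract, the following is a minimal single-process sketch of its three stages (map, shuffle, reduce) applied to word counting. It does not use Hadoop or Spark, and it is not drawn from the tutorial itself; the function names `map_phase`, `shuffle`, `reduce_phase` and `word_count` and the sample input are hypothetical.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values for one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run the three stages in sequence on an in-memory list of lines."""
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(pairs).items())


counts = word_count(["big data big ideas", "Data moves"])
```

In a distributed framework the map and reduce calls would run in parallel on different machines, with the shuffle stage moving data between them; here everything runs in one process purely to show the data flow.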

Bio: David Ostrowski works in the Global Data Insight and Analytics organization as a Senior Analytics Scientist concentrating in the area of Product Analytics. Dr. Ostrowski holds a Doctorate in Computer Science from Wayne State University and has over 30 years of industry experience in software development in many areas, including real-time data acquisition and analytics. He has 20 years of experience teaching CIS curricula at the undergraduate and graduate levels. Dr. Ostrowski also has over 40 refereed publications, two book chapters and numerous technical reports, and participates in several technical committees including IEEE TEC transactions, IEEE ICSC and IEEE ICIOS.

 

Vagelis Hristidis, University of California, Riverside, USA
Title: Chatbot Technologies and Challenges

 

Abstract: Chatbots have recently become popular due to the widespread use of messaging services and the advancement of Natural Language Understanding. In this tutorial, we give an overview of the technologies that drive chatbots, including Information Extraction and Deep Learning. We discuss the differences between conversational and transactional chatbots – the former are trained on free-form chat logs, whereas the latter are defined manually to achieve a specific goal like booking a flight. We also provide an overview of commercial tools and platforms that can help in creating and deploying chatbots. Finally, we present the limitations and open challenges in this area.

Bio: Vagelis Hristidis is a Professor in the Computer Science and Engineering Dept. at UC Riverside. He received his PhD from UC San Diego. Hristidis is an expert in the management and querying of semistructured and text data, with an emphasis on social network and health data. His work on searching semistructured data has received more than 6,000 citations according to Google Scholar. His key achievements include the NSF CAREER award, a Google Research Award, an IBM Scalable Data Analytics for A Smarter Planet Innovation Award, the FIU SCIS Excellence in Research Award (twice), the FIU University Faculty Award and the Kauffmann Entrepreneurship Award. His work on Twitter analytics was covered by Forbes, The Washington Post, The NY Times, Yahoo!, The Times, The Telegraph, The Independent, Daily Mail (UK), and others. He also received the best paper award at ACM CIKM 2010. He is the co-founder of SmartBot360, a platform for collaboration between humans and chatbots.


Peter Shaw, Massey University, New Zealand
Title: Combinatorial Algorithms in Machine Learning

 

Abstract: The classic data clustering problem, although quite old, seeks to segment data into homogeneous groupings, where homogeneity is measured by, e.g., the Gini index. Classical techniques, all heuristic, group the data by what one could describe as a "smart" trial-and-error procedure. I will show how data can be clustered using entirely combinatorial techniques in which the Gini index and mean squared error receive no mention whatsoever. The Cluster Editing pattern search algorithm shows great promise for solving those intractable high-dimensional problems because it is almost totally indifferent to the dimensionality of the data.
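To make the combinatorial flavor concrete, the sketch below assumes that Cluster Editing refers to the standard graph problem: given a graph whose edges link similar items, find the fewest edge insertions and deletions that turn it into a disjoint union of cliques. This is an illustrative brute-force search, not the speaker's algorithm; the function names `edit_cost`, `partitions` and `best_clustering` and the example graph are hypothetical.

```python
from itertools import combinations


def edit_cost(edges, clusters):
    """Count edge insertions plus deletions needed to turn the graph into
    a disjoint union of cliques, one clique per cluster."""
    edge_set = {frozenset(e) for e in edges}
    label = {v: i for i, cluster in enumerate(clusters) for v in cluster}
    cost = 0
    for u, v in combinations(sorted(label), 2):
        same = label[u] == label[v]
        linked = frozenset((u, v)) in edge_set
        if same != linked:  # missing edge inside a cluster, or edge across clusters
            cost += 1
    return cost


def partitions(items):
    """Yield every partition of a list into non-empty blocks."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for part in partitions(rest):
        yield [[first]] + part
        for i in range(len(part)):
            yield part[:i] + [[first] + part[i]] + part[i + 1:]


def best_clustering(vertices, edges):
    """Exhaustive search; only feasible for a handful of vertices."""
    return min(partitions(list(vertices)), key=lambda p: edit_cost(edges, p))


# Hypothetical similarity graph: a triangle a-b-c joined to an edge d-e via c-d.
edges = [("a", "b"), ("b", "c"), ("a", "c"), ("c", "d"), ("d", "e")]
clusters = best_clustering("abcde", edges)
cost = edit_cost(edges, clusters)
```

The input is only a set of pairwise links, so the coordinates of the original data points never enter the search. The exhaustive enumeration is exponential and serves only to define the objective; practical approaches such as fixed-parameter algorithms avoid it.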

Bio: Dr. Peter Shaw is a senior lecturer at Massey University, New Zealand, and teaches at Hebut University of Technology, Tianjin, China. His research is in the area of efficient Fixed-Parameter Algorithms. He also works on the analysis of clinical data and employs these algorithms to do pattern search on multivariate models that he produces from the data using machine learning techniques. He is currently looking at data sets related to Otitis Media (deafness), ALRI (acute lung infections), human trafficking, and transportation.