Day 1 | May 28, 2019
8:00 – 9:15 AM | Registration and Breakfast
9:15 – 9:30 AM | Opening Remarks and Welcome
- Jim Ghadbane, President and CEO | CANARIE
- Scott Henwood, Director, Research Software | CANARIE
9:30 – 10:30 AM | Keynote: Deep Reinforcement Learning in Research
- Dr. Pablo Castro, Research Software Developer | Google AI
Deep reinforcement learning (deep RL) is an area of machine learning that has grown significantly in recent years, aiming to develop autonomous agents to make optimal decisions for a particular task. Although a number of software offerings now exist that provide stable, comprehensive implementations for different RL methods, recent deep RL research has become more diverse in its goals. At Google AI, we built Dopamine: a new research framework for deep RL that focuses on an important area of deep RL research. Our aim was to provide a framework that is compact and easy to understand, enabling fast prototyping of new ideas. Dopamine is open-source, TensorFlow-based, and provides compact and reliable implementations of some state-of-the-art deep RL agents.
10:30 – 11:00 AM | Networking Break
11:00 – 11:20 AM | Short Talk: Implementing a Research Software Development Team at McMaster University: Early Lessons Learned
- Ranil Sonnadara, Associate Professor & Special Advisor to the Vice-President, Research | McMaster University
11:20 – 11:40 AM | Short Talk: GUIs are Luxury Items in the Science World
- David Huard, Specialist, Climate Scenarios and Services | Ouranos
Anecdotal evidence suggests that GUIs are often shunned by scientists, who prefer programmatic access to platform services. Indeed, scripting provides a level of transparency and reproducibility that few GUIs can offer. Why then do research platforms feel the need to develop a GUI, when it can drain 30-50% of the project’s funds? Is the real purpose of GUIs to wow funders and upper management? Is it worth the opportunity costs for scientific users?
11:40 AM – 12:00 PM | Short Talk: Docker Swarm: Myths and Realities of Deploying Complex Environments for Scientific Software
- Anton Zakharov, Research Software Developer | CRIM
When it’s time to deploy a project into production, relying solely on Docker or Docker-Compose tools requires time-consuming and error-prone procedures: imagine you need to perform horizontal scaling for some services, in a stack that contains dozens of services, and make sure that everything stays connected. During this talk, we’ll share our approach to migrating our Docker-Compose stack to a Swarm environment and the obstacles we encountered: persistent data management, networking issues, resource reservation.
12:00 – 1:00 PM | Lunch and Networking
1:00 – 1:20 PM | Short Talk: Utilizing High-Throughput Functional Genomics Databases
- Dr. Emma Bell, Princess Margaret Cancer Centre | UHN
High-throughput genomics experiments cost time, money, and effort. Public repositories that store, organize, and distribute this data present an ever-growing treasure trove of information. However, their sheer size means these repositories are incomplete and riddled with poor-quality, incoherently annotated data sets. This talk aims to provide fellow users catharsis and practical advice on how to avoid common pitfalls. Those outside of genomics will take away cautionary tales from a young field.
1:20 – 1:40 PM | Short Talk: Data Entry: Boon and Bane
- Morgan Taschuk, Senior Manager | Ontario Institute for Cancer Research
Data entry makes users and developers alike shudder, yet it is absolutely critical to modern systems. Most software is not designed with the user experience in mind. Our goal in developing MISO LIMS was to make data entry not suck, and in this talk we describe perspectives and lessons learned.
1:40 – 2:00 PM | Short Talk: iReceptorPlus – Small Project to International Research Software Engineering – Part II
- Brian Corrie, Technical Director, iReceptor | Simon Fraser University
Last year, I talked about the iReceptor Platform and how it had emerged from a small CANARIE NEP project to being part of a collaboration around international standards. Since then, the iReceptor team and many of those collaborators have been awarded a joint EU Horizon 2020/CIHR grant, with the iReceptor Platform at its foundation. The project involves 20 partners and 9 countries. In this talk I will discuss our plans for managing this emerging, international research software engineering project.
2:00 – 2:05 PM | Lightning Talk: Jenkins Configuration as Code
- Long Vu, Software Developer | Ouranos
Configuring a Jenkins continuous build/integration server has typically been a manual task of clicking through the Web UI. This means it is hard to keep multiple existing Jenkins instances (Production/Staging) in sync configuration-wise, and it is hard to start a fresh new Jenkins instance fully configured from the get-go. This talk will demonstrate the new Jenkins Configuration as Code plugin, allowing you to start up a fully configured Jenkins. See: https://github.com/Ouranosinc/jenkins-config
2:05 – 2:10 PM | Lightning Talk: vtree: An R Package for Calculating and Drawing Variable Trees
- Nick Barrowman, Statistician | Children’s Hospital of Eastern Ontario Research Institute
For many scientific data sets, questions about nested subsets arise, but the calculation and visualization of subsets can be time-consuming and error-prone. As the number of nested subsets increases, the magnitude of the task grows rapidly. And if there are missing values in the data set, the task becomes even more complicated. A tree structure provides a natural way to represent nested subsets of a data set. The vtree R package is a flexible tool for calculating and drawing “variable trees”.
2:10 – 2:15 PM | Lightning Talk: Going from Sue Doe to John Doe
- Dr. Karim Bouayad-Gervais, Pillar Science
Research software enables researchers to reach new heights. However, for researchers with limited technical skills, these software innovations remain mostly inaccessible. These researchers still have to rely on easy-to-use yet inefficient and error-prone technologies. This is even more important given the increased pressure for academics to remain competitive. We will discuss the importance of designing accessible research software by presenting the case of a research group in life science.
2:15 – 2:20 PM | Lightning Talk: IRIDA: Integrated Rapid Infectious Disease Analysis Platform
- Dan Fornika, Genomics Specialist | BC Centre for Disease Control Public Health Lab
The Integrated Rapid Infectious Disease Analysis (IRIDA) platform facilitates genomics-based public health investigations. The platform offers project management, analysis workflow and data sharing facilities with an intuitive web-based user interface. IRIDA supports high-throughput routine analysis with a set of curated and optimized workflows. In addition, a new plugin-based pipeline system allows developers to add customized pipelines that are relevant to them.
2:20 – 2:25 PM | Lightning Talk: Software Tools for Visualization and Analysis of Networks
- Max Franz, Senior Software Engineer | University of Toronto
Understanding disease often involves big data in the form of networks. To leverage this data, we require software for efficient data storage and analysis, as well as intuitive user-facing tools.
We developed a suite of foundational technologies for semantic structuring, storage, retrieval, and interactive network UIs that enable integration and sharing. On that foundation, we are growing our ecosystem of network apps and building a centralized web platform to expand its reach and accessibility.
2:25 – 2:30 PM | Lightning Talk: naturecounts: A New R Package to Access Standardized Data on Bird Populations
- Dr. Stefanie LaZerte, Bird Studies Canada
NatureCounts is an online repository of plant and animal population data managed by Bird Studies Canada. It has over 128 million records from nearly 400 different sources generated by volunteer surveys and research projects across Canada. Recently, we have developed an open-source R software package, “naturecounts”, to facilitate access to the NatureCounts repository via a new public API. This supports Bird Studies Canada’s goal of providing data access for research, conservation, and education.
2:30 – 3:00 PM | Networking Break
3:00 – 3:05 PM | Lightning Talk: Experimental Paradigms in Human Movement Science: Control of Complex Dynamics Using Sonification and Visualization
- Dobromir Dotov, Research and High-Performance Computing Services (RHPCS) | McMaster University
Motor skill is better described as dynamic or physical intelligence, not computational intelligence. In this sense, the experimental paradigms in disciplines such as motor (re)training and skill acquisition emphasize the fidelity of real-time feedback and the relevance of the artificial dynamic task space for one’s capacities. For the sake of fidelity and transfer to real-world application, these paradigms have to rely on affordable hardware. For the sake of relevance, they need to instantiate the control of objects that have complex or underactuated dynamics. We present cases from our work with very low-cost hardware where visualization or sonification of movement was used to interact with simulated complex dynamics. What is needed in the future is to increase the dimensionality and complexity of the task spaces to create more choices for researchers. Furthermore, advances in computational and statistical methods could help the identification of dynamic invariants in recorded movement data.
3:05 – 3:10 PM | Lightning Talk: The BESOS Platform
- Paul Kovacs, Programmer Analyst | University of Victoria
Buildings and energy systems play a crucial role in mitigating climate change, one of the greatest threats humanity faces. Work on these systems ranges from reducing and shifting energy demands to more efficient delivery and supply of energy. The Building and Energy Simulation, Optimization and Surrogate-Modelling (BESOS) platform will provide a suite of modules for the simulation and optimization of buildings and urban energy systems.
3:10 – 3:20 PM | Lightning Talk: Software Integration in the Digital Humanities
- Susan Brown, Professor/Project Lead | University of Guelph – Canadian Writing Research Collaboratory
- Mihaela Ilovan, Project Manager | University of Guelph – Canadian Writing Research Collaboratory
The Canadian Writing Research Collaboratory (CWRC) online Digital Humanities platform integrates research data storage with tools for research, discovery, and dissemination, many sourced from external development partners. CWRC will present three brief case studies of software integration (NERVE named entity recognition tool, DToC e-reader, and HuViz LOD visualizer) and describe the linked open data approach adopted to support collaboration and the interoperability of curated research data.
3:20 – 3:25 PM | Lightning Talk: EEG Pipelines Integration into the CBRAIN Platform Using the Boutiques Command Line Framework
- Dr. Sergiy Boroday, McGill Centre for Integrative Neuroscience | McGill University
- Dr. Obaï Bin Ka’b Ali | Concordia University
CBRAIN is a web-based grid portal allowing collaborative large-scale data processing, originally for neuroimaging. To promote research involving electro- and magnetoencephalogram data, we added two pipelines: a source localization toolbox, BEst (Brain Entropy in Space and Time), and qEEG. Pipeline descriptors in the Boutiques standard are available online and can be used by developers for execution on computing platforms. In this talk, we focus on the pipeline integration process.
3:25 – 3:30 PM | Lightning Talk: Data Augmentation – How to Get the Best Possible Outcome from Big Data / AI Analytics
- Dominic Lam, Director, Business & Solution Development | Datadex Inc.
Sophisticated Big Data and AI analytics deliver their best possible outcome only when the data has a statistically significant sample size and features rich enough to characterize the complex system under investigation. Data Augmentation identifies relevant data and merges them to help achieve that objective. Datadex performs data search, indexing, modelling API, linking, exchange and merging to deliver Data Augmentation capability. Datadex is available to researchers at no charge.
3:30 – 3:35 PM | Lightning Talk: Streamlining the Inclusion of Computer Experiments in a Research Paper
- Sylvain Hallé, Professor | Université du Québec à Chicoutimi
To run experiments on a computer, you probably write command-line scripts for various tasks: generate your data, save it into files, process and display them as plots or tables to include in a paper. But soon enough, your handful of “quick and dirty” files becomes a bunch of poorly documented scripts that generate and pass around all kinds of obscure temporary files. LabPal is a library that allows you to set up an environment for running experiments, collating their results and processing them.
3:35 – 3:55 PM | Short Talk: Challenges and Opportunities of Adding Non-Standard Data to an Existing Repository: Driving Square Pegs into Round Holes
- Doug Mulholland, Technical Manager | University of Waterloo, Computer Systems Group
Vast amounts of disparate data are being generated in every field of study and stored in custom-built repositories with a limited ability to accommodate non-standard data. “Unconventional” data that should be added is often discovered later. A core aspect of the iEnvironment project is to extend existing systems to handle non-standard data. The loss of legacy data due to approaching retirements or contrarian views on science, such as climate change, will also be discussed.
3:55 – 4:15 PM | Short Talk: Best Practices in Software Development
- Henriette Koning, Director IT PMO | Stemcell Technologies Inc
Drawing on experience with small software development teams delivering new technologies such as machine learning algorithms, Henriette will share best practices and lessons learned on how to structure your software development project, how to use an Agile approach to handle uncertainty and innovation, and how you can set up your team for success.
5:00 – 7:00 PM | Reception
Day 2 | May 29, 2019
8:15 – 9:15 AM | Registration and Breakfast
9:15 – 10:15 AM | Keynote: Making Effective, Useful Software Development Tools
- Dr. Gail Murphy, Professor (Computer Science) & Vice-President Research & Innovation | University of British Columbia
Software systems are sometimes referred to as the most complex artifacts ever engineered. To help create these systems, software engineers use many tools. The vast majority of these tools have been designed from an artifact-centric perspective; for instance, a compiler takes one representation of a program and changes it into another representation. In this talk, I will argue that a human-centric approach to designing software development tools is essential to accelerate our ability to build complex software systems with desired qualities. A human-centric approach involves a focus on how humans work with computational structures and with each other. By taking a human-centric approach, we can improve our ability to produce software development tools that are effective and useful for humans. It is time we ensure that the tools we build work for humans instead of humans working for the tools.
10:15 – 10:45 AM | Networking Break
10:45 – 11:45 AM | Keynote: Why is Building Robust and Reliable Robotics Software So Hard?
- Dr. Jonathan Kelly, Director, Space & Terrestrial Autonomous Robotic Systems (STARS) Laboratory and Professor, Institute for Aerospace Studies (UTIAS) | University of Toronto
Building robust, reliable, and safe robotic systems that operate in controlled environments is hard. As robots move out of factories and laboratories and into the (human-centric) world, the problem is getting harder. Although the durability and resilience of robot mechanical and hardware components play a role, the primary difficulty lies in creating sufficiently capable software. Robots that operate in and around people must be extremely safe and trustworthy, or they will simply not be accepted; that is, nobody will buy them or use them. In this talk, I will give an overview of the challenges involved in designing software for human-centric autonomy, including the need to understand highly dynamic environments, to handle uncertain human actions, and to manage complex corner cases. I will close the talk by offering some suggestions on ways that software developers can address these issues, helping to bring robots into our everyday lives.
11:45 AM – 1:00 PM | Lunch and Networking
1:00 – 1:20 PM | Short Talk: Using Ontologies to Standardize Data Sharing
- Damion Dooley, Scientific Programmer | University of British Columbia
Sharing data across agencies and platforms often consists of mapping across in-house and semi-standard data dictionaries, yielding costly and fragile solutions. Stepping beyond an ISO-styled approach, I will introduce the benefits of bringing formal, logically consistent controlled vocabularies into databases and software components. Biomedical and other domain ontologies are gaining momentum as open-source de facto standards, advancing us towards a future of plug-and-play standardized data.
1:20 – 1:40 PM | Short Talk: Geodisy: Developing a Discovery Layer for Geospatial Research Data in Canada
- Paul Dante, Software Engineer | University of British Columbia
- Mark Goodwin, Metadata Coordinator | University of British Columbia
With increasing demand for geographic components in research, there is an opportunity for research data repositories to provide alternatives to text-based searching. The goal of the CANARIE-funded Geodisy project is to create an extensible open-source software method to discover Canadian geospatial research data using a map interface. In this session, we will share software architecture designs and our progress toward normalizing various metadata standards into discoverable geospatial metadata.
1:40 – 2:00 PM | Short Talk: ORCID-CA and the Infrastructure of Open Science in Canada
- Jeffrey Demaine, ORCID-CA Community Manager | Canadian Research Knowledge Network
ORCID provides researchers with a unique identifier to associate with their research contributions and affiliations. Beyond its usefulness for individuals, ORCID is used by research institutions at a system level to ensure data quality and to streamline workflows. The ORCID-CA consortium provides universities and research institutions across Canada with membership in this trusted network, allowing them to connect their research-management systems with their researchers’ outputs. This ability to link to the ORCID database via an API greatly simplifies the administration of research at the level of both the individual and the institution. Identifiers such as ORCID have an important role to play in enabling Open Science, as datasets and publications become interoperable once their metadata is linked.
2:00 – 2:10 PM | Stretch Break
2:10 – 2:30 PM | Short Talk: Radiam: Helping Researchers Keep Track of Data
- Todd Trann, Senior Software Developer | University of Saskatchewan
Where is your research data, how is it organized, and how can you find what you need with many people working on the same project? We’re working on helping researchers tackle these questions.
The University of Saskatchewan and Simon Fraser University, through funding from CANARIE, are collaborating to develop a set of scalable software components called Radiam to fill functional gaps identified in existing tools and services for the management of active research data.
2:30 – 2:50 PM | Short Talk: Pore-Scale Analysis of Electrochemical Devices
- Jeff Gostick, Professor | University of Waterloo
Electrochemical energy storage, e.g. lithium-ion batteries or redox-flow cells, will play a key role in a renewable energy future by bridging the mismatch between generation and consumption of electricity. These devices all share a common need for highly sophisticated porous electrode structures to support the reactions. Advances in 3D imaging techniques have opened a new window into these materials, via pore-scale simulations directly on images. Progress and challenges will be presented.
2:50 – 2:55 PM | Survey and Poster Awards
- Dan Sellars, Manager, Software Development | CANARIE
- Scott Henwood, Director, Research Software | CANARIE
2:55 – 3:00 PM | Closing Remarks
- Mark Wolff, Chief Technology Officer | CANARIE