Daniel S. Katz (University of Illinois): Parsl: Pervasive Parallel Programming in Python

Please note non-standard date and time for this talk

Abstract: High-level programming languages such as Python are increasingly used to provide intuitive interfaces to libraries written in lower-level languages and for assembling applications from various components. This migration towards orchestration rather than implementation, coupled with the growing need for parallel computing (e.g., due to big data and the end of Moore’s law), necessitates rethinking how parallelism is expressed in programs.

Here, we present Parsl, a parallel scripting library that augments Python with simple, scalable, and flexible constructs for encoding parallelism. These constructs allow Parsl to construct a dynamic dependency graph of components from a Python program enhanced with a small number of decorators that define the components to be executed asynchronously and in parallel, and then execute it efficiently on one or many processors. Parsl is designed for scalability, with an extensible set of executors tailored to different use cases, such as low-latency, high-throughput, or extreme-scale execution. We show, via experiments on the Blue Waters supercomputer, that Parsl executors can allow Python scripts to execute components with as little as 5 ms of overhead, scale to more than 250000 workers across more than 8000 nodes, and process upward of 1200 tasks per second.

Other Parsl features simplify the construction and execution of composite programs by supporting elastic provisioning and scaling of infrastructure, fault-tolerant execution, and integrated wide-area data management. We show that these capabilities satisfy the needs of many-task, interactive, online, and machine learning applications in fields such as biology, cosmology, and materials science.
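
As a rough illustration of the decorator-and-futures model the abstract describes, the sketch below uses only Python's standard concurrent.futures module. It is not Parsl's API: the decorator name parallel_app and the example functions are hypothetical, chosen only to show how passing futures between decorated calls implicitly builds a dependency graph.

```python
from concurrent.futures import Future, ThreadPoolExecutor
from functools import wraps

# A small shared thread pool stands in for a Parsl executor (assumption).
_executor = ThreadPoolExecutor(max_workers=4)


def parallel_app(func):
    """Hypothetical decorator: calling the function returns a Future at once."""
    @wraps(func)
    def submit(*args):
        def run():
            # Arguments that are futures are the edges of the dependency
            # graph: wait for them before running the wrapped function.
            resolved = [a.result() if isinstance(a, Future) else a for a in args]
            return func(*resolved)
        return _executor.submit(run)
    return submit


@parallel_app
def square(x):
    return x * x


@parallel_app
def add(x, y):
    return x + y


# The two square() calls are independent and may run in parallel;
# add() starts its real work only once both results are available.
total = add(square(3), square(4))
result = total.result()
print(result)  # 25
_executor.shutdown()
```

Blocking inside a worker thread keeps the sketch short, but it would not scale to large graphs. A production system would more plausibly schedule a task only once its inputs are ready.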

Slides: see here.

Speaker Bio: Daniel S. Katz is Assistant Director for Scientific Software and Applications at the National Center for Supercomputing Applications (NCSA), and Research Associate Professor in Computer Science; Electrical & Computer Engineering; and the School of Information Sciences at the University of Illinois Urbana-Champaign. For further details, please see his website here.

Event details

  • When: 18th October 2019 13:00 - 14:00
  • Where: Cole 1.33b
  • Series: School Seminar Series
  • Format: Seminar

Code4REF: Recording software outputs in Pure

Do you develop research software? If so, you may be interested in the Code4REF project, which explains how to record it in Pure – the research information system used in St Andrews. Research software is a primary research output, and it should get the same visibility as research publications on the University research portal. You can find all current software entries in the Research Portal here, but the picture is certainly incomplete – we know many more researchers who write code. We call on everyone to join the effort and help us collect further evidence that software is vital for research!

If you have any comments about the Code4REF project, please create an issue in its GitHub repository.

Software Carpentry Workshop

Registration is open for the next Software Carpentry workshop in St Andrews on September 23-24 in Parliament Hall. We will teach the UNIX shell, version control with Git and programming with Python. Please see the workshop page for further details and the link to registration via PDMS.

Event details

  • When: 23rd September 2019 - 24th September 2019
  • Where: Parliament Hall
  • Format: Workshop

Donald Robertson awarded Brendan Murphy Prize at MSN/Cosener’s 2019!

Each year in July, the (broadly-defined) computer networking community converges at Cosener’s House for the MSN workshop. The workshop is an informal gathering where attendees – students in particular – are encouraged to present ongoing work and/or crazy ideas. From among the presentations, the Brendan Murphy Award is given to the best student presentation, generally for work that has yet to be scrutinized or peer-reviewed.

Congratulations to Donald Robertson who, this year, has brought that honour to St Andrews as co-recipient of the award (alongside Naomi Arnold from QMUL).

http://coseners.net/history/brendan-murphy-prize/

(In the interest of transparency, Marwan Fayed was on the judging panel but recused himself during discussion of Donald’s presentation.)

The Melville Trust for the Care and Cure of Cancer PhD award

The Melville Trust for the Care and Cure of Cancer has funded a PGR Studentship for the project entitled ‘Detecting high-risk smokers in Primary Care Electronic Health Records: An automatic classification, data extraction and predictive modelling approach’.

The supervisors are Prof. Frank Sullivan of the School of Medicine and Prof. Tom Kelsey of the School of Computer Science, with work commencing in September 2019. The award is for £83,875.

MIP Modelling Made Manageable

Can a user write a good MIP model without understanding linearization? Modelling languages such as AMPL and AIMMS are being extended to support more features, with the goal of making MIP modelling easier. A big step is the incorporation of predicates, such as “cycle”, which encapsulate MIP sub-models. This talk explores the impact of such predicates in the MiniZinc modelling language when it is used as a MIP front-end. It reports on the performance of the resulting models, and the features of MiniZinc that make this possible.

Professor Mark Wallace is Professor of Data Science & AI at Monash University, Australia. We gratefully acknowledge support from a SICSA Distinguished Visiting Fellowship which helped finance his visit.

Professor Wallace graduated from Oxford University in Mathematics and Philosophy. He worked for the UK computer company ICL for 21 years while completing a Master’s degree in Artificial Intelligence at the University of London and a PhD sponsored by ICL at Southampton University. For his PhD, Professor Wallace designed a natural language processing system which ICL turned into a product. He moved to Imperial College in 2002, taking a Chair at Monash University in 2004.

His research interests span different techniques and algorithms for optimisation and their integration and application to solving complex resource planning and scheduling problems. He was a co-founder of the hybrid algorithms research area and is a leader in the research areas of Constraint Programming (CP) and hybrid techniques (CPAIOR). The outcomes of his research in these areas include practical applications in transport optimisation.

He is passionate about modelling and optimisation and the benefits they bring. His focus both in industry and University has been on application-driven research and development, where industry funding is essential both to ensure research impact and to support sufficient research effort to build software systems that are robust enough for application developers to use.

He led the team that developed the ECLiPSe constraint programming platform, which was bought by Cisco Systems in 2004. Moving to Australia, he worked on a novel hybrid optimisation software platform called G12, and founded the company Opturion to commercialise it. He also established the Monash-CTI Centre for optimisation in travel, transport and logistics. He has developed solutions for major companies such as BA, RAC, CFA, and Qantas. He is currently involved in the Alertness CRC, plant design for Woodside planning, optimisation for Melbourne Water, and work allocation for the Alfred hospital.

Event details

  • When: 19th June 2019 11:00 - 12:00
  • Where: Cole 1.33a
  • Series: AI Seminar Series
  • Format: Lecture, Seminar

St Andrews Bioinformatics Workshop 10/06/19

Next Monday is the annual St Andrews Bioinformatics workshop in Seminar Room 1, School of Medicine. Some of the presentations are very relevant to Computer Science, and all should be interesting. More information below:

Agenda:

14:00 – 14:15: Valeria Montano: The PreNeolithic evolutionary history of human genetic resistance to Plasmodium falciparum

14:15 – 14:30: Chloe Hequet: Estimation of Polygenic Risk with Machine Learning

14:30 – 14:45: Roopam Gupta: Label-free optical hemogram of granulocytes enhanced by artificial neural networks

15:00 – 15:15: Damilola Oresegun: Nanopore: Research; then, now and the future

15:15 – 15:30: Xiao Zhang: Functional and population genomics of extremely rapid evolution in Hawaiian crickets

15:30 – 16:00: Networking with refreshments

16:00 – 17:00: Chris Ponting: The power of One: Single variants, single factors, single cells

You can register your interest in attending here.

Event details

  • When: 10th June 2019 14:00 - 17:00
  • Format: Lecture, Talk, Workshop

Professor Aaron Quigley new SICSA Director

Congratulations to Professor Aaron Quigley, who has been appointed as the new Director of SICSA. Aaron, the Chair of Human Computer Interaction, co-founded SACHI, the St Andrews Computer Human Interaction research group, and served as its director from 2011 to 2018.

In his volunteer roles he is the ACM SIGCHI Vice President for Conferences (on the ACM SIGCHI Executive Committee), a member of the ACM Europe Council Conferences Working Group, a board member of ScotlandIS and an ACM Distinguished Speaker. Aaron will be general co-chair for the ACM CHI conference in Asia in 2021.

For more information about Professor Quigley, please see https://aaronquigley.org.

Juho Rousu: Predicting Drug Interactions with Kernel Methods

Title:
Predicting Drug Interactions with Kernel Methods

Abstract:
Many real-world prediction problems can be formulated as pairwise learning problems, in which one is interested in making predictions for pairs of objects, e.g. drugs and their targets. Kernel-based approaches have emerged as powerful tools for solving problems of that kind, and multiple kernel learning (MKL) in particular offers promising benefits, as it enables integrating various types of complex biomedical information sources in the form of kernels, along with learning their importance for the prediction task. However, the immense size of pairwise kernel spaces remains a major bottleneck, making the existing MKL algorithms computationally infeasible even for a small number of input pairs. We introduce pairwiseMKL, the first method for time- and memory-efficient learning with multiple pairwise kernels. pairwiseMKL first determines the mixture weights of the input pairwise kernels, and then learns the pairwise prediction function. Both steps are performed efficiently without explicit computation of the massive pairwise matrices, therefore making the method applicable to solving large pairwise learning problems. We demonstrate the performance of pairwiseMKL in two related tasks of quantitative drug bioactivity prediction using up to 167 995 bioactivity measurements and 3120 pairwise kernels: (i) prediction of anticancer efficacy of drug compounds across a large panel of cancer cell lines; and (ii) prediction of target profiles of anticancer compounds across their kinome-wide target spaces. We show that pairwiseMKL provides accurate predictions using sparse solutions in terms of selected kernels, and therefore it also automatically identifies data sources relevant for the prediction problem.
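
The abstract does not say how the explicit pairwise matrices are avoided. As one illustration of how such a saving can arise, the sketch below assumes the pairwise kernel is the Kronecker product of a drug kernel and a target kernel, and uses the standard identity that multiplying this product by a vector equals a pair of small matrix products. All names and the toy matrices are hypothetical, and this is not presented as the pairwiseMKL algorithm itself.

```python
def matmul(x, y):
    """Plain-Python matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def transpose(m):
    return [list(row) for row in zip(*m)]


def kron(a, b):
    """Explicit Kronecker product; its size grows as (rows_a * rows_b) squared."""
    return [[a_ik * b_jl for a_ik in row_a for b_jl in row_b]
            for row_a in a for row_b in b]


def explicit_product(k_drug, k_target, vec):
    """Builds the full pairwise kernel matrix, then multiplies it by vec."""
    full = kron(k_drug, k_target)
    return [sum(f * v for f, v in zip(row, vec)) for row in full]


def implicit_product(k_drug, k_target, vec):
    """Same result via K_drug @ A @ K_target^T, never forming the big matrix."""
    n_t = len(k_target)
    # Reshape vec row-major into a (drugs x targets) matrix A.
    a = [vec[i * n_t:(i + 1) * n_t] for i in range(len(k_drug))]
    out = matmul(matmul(k_drug, a), transpose(k_target))
    return [v for row in out for v in row]


# Toy kernels: 2 drugs and 3 targets, so the explicit matrix would be 6 x 6.
k_drug = [[2, 1], [1, 3]]
k_target = [[1, 0, 2], [0, 1, 1], [2, 1, 4]]
vec = [1, 2, 3, 4, 5, 6]

explicit = explicit_product(k_drug, k_target, vec)
implicit = implicit_product(k_drug, k_target, vec)
print(implicit == explicit)
```

With n drugs and m targets, the explicit matrix has (nm)² entries, whereas the implicit route only ever stores matrices of size n × n, m × m and n × m.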

References:
Anna Cichonska, Tapio Pahikkala, Sandor Szedmak, Heli Julkunen, Antti Airola, Markus Heinonen, Tero Aittokallio, Juho Rousu; Learning with multiple pairwise kernels for drug bioactivity prediction, Bioinformatics, Volume 34, Issue 13, 1 July 2018, Pages i509–i518, https://doi.org/10.1093/bioinformatics/bty277

Short Bio:
Juho Rousu is a Professor of Computer Science at Aalto University, Finland. Rousu obtained his PhD in 2001 from the University of Helsinki, while working at VTT Technical Centre of Finland. In 2003-2005 he was a Marie Curie Fellow at Royal Holloway University of London. In 2005-2011 he held Lecturer and Professor positions at the University of Helsinki, before moving to Aalto University in 2012, where he leads a research group on Kernel Methods, Pattern Analysis and Computational Metabolomics (KEPACO). Rousu’s main research interest is in learning with multiple and structured targets, multiple views and ensembles, with methodological emphasis on regularised learning, kernels and sparsity, as well as efficient convex/non-convex optimisation methods. His applications of interest include metabolomics, biomedicine, pharmacology and synthetic biology.

Event details

  • When: 30th April 2019 14:00 - 15:00
  • Where: Cole 1.33a
  • Format: Seminar