THE SEVEN DEADLY SINS OF SCIENCE GATEWAYS INITIATIVES

    1. They act as a Grid “cache-misère”. They maintain the illusion that grids can be “fixed” or “recycled” or made “more appealing”. They delay a necessary moratorium on a costly and obsolete technology and paradigm whose ineluctable death became today as obvious as the sun.

    2. They assume that the only way to interact with a federated infrastructure is a job scheduler of some kind. By providing a “federation layer” to e-Infrastructure, they make everything look like a grid. Such a choice compromises the interaction design that could be envisaged at the user facing layer. In particular, interactive computing and real-time collaboration are not any more possible. The grid mentality should die. Interactive computing (the IPython way) should receive more focus.

    3. They envisage the infrastructure with a pre-cloud mind-set. Before elasticity, the most compelling feature of clouds is scriptability: few lines of code can describe and bring to life a complex hardware/software architecture, the back-end for computation can and should be built on-the-fly, on real-time, based on libraries of infrastructure-describing scripts. Everything should be targeting Infrastructure-as-a-Service-style clouds and make use of their full potential.

    4. They consider Graphic User Interfaces as just software that can be built by developers and researchers. The challenge of building usable man-machine interfaces requires expertise and should be done by people whose job is to design interaction. Usability is hard, it doesn’t just happen. Systematic involvement of interaction designers is key.

    5. They overlook the fact that building sustainable engineering artifacts is different from research and that the structures and frameworks that work for research projects may not be effective in building and delivering infrastructures and tools for science. They keep reinventing the wheel and proposing yet another “middleware”. They build software in conditions and with processes that do not enable to build high quality software. They reproduce again and again the “death march” (E. Yourdon) towards software doomed to fail.They get overwhelmed by the technical complexity and forget that the survival of a software is a more daunting task than its design and building. Right after the software delivery and in the absence of an ecosystem, starts another “death march” towards obsolescence.They should either take the software ecosystem building challenge seriously. Involve in the project and in the strategic thinking experts in software design and software ecosystems. Consider the ecosystem to grow as the core objective. Potentially get the necessary guidance from a central European agency (to be invented) that would provide expertise and coaching. or get connected from day one to an existing ecosystem and shape the project’s outcome towards becoming an artifact valuable to an established community.

    6. They overlook the fact that if an application is based on a frozen set of requirements, it can’t be a tool for science where everything is moving, exploratory, transient by nature. Scientists love Matlab, R, Python, etc. because those tools allow them to progress towards understanding their data, building their models, comparing their results with others’: They follow a “Brownian motion” towards the unknown. R, Python, Matlab allow them to capture their non-predictable-in-advance trajectory towards a scientifically relevant/”publishable” result in the form of a “script”. That script can be shared and reused as is or in the form of a component/library/module/package that others can import in their own environments to reproduce their peers’ trajectory before envisaging to explore a new one of their own. Science Gateways and the workflow-paradigm they often rely on fail short in allowing such a “hyper agile”, traceable and reproducible scientific process. If science gateways should ever be useful to more than a handful of scientists, they have to comply with and empower this way of work, in particular: (a) No IT people should be involved in creating those science gateways, scientists should be able to build them and deploy them from the R, Python or Matlab command lines. Interaction components, views for data visualization, etc. should be scriptable and easy to combine with the tools scientists use to program with data. (b) Significant added value should come with the science gateways to convince the scientists to consider them. For instance enabling real-time collaboration (the Google-docs way) while accessing/analysing/visualising data would bring to the scientists’ desk capabilities they are currently eager to have. Also, adding social components that would allow them to engage with each other as small groups or communities would be valuable. Those scenarios are not any more science fiction thanks to the capabilities of cloud technologies and to the maturity reached by hundreds of open source tools, frameworks, computational libraries and infrastructure software.

    7. They lobby to give the science gateway/e-Infrastructure they build a fictitious appearance of popularity. The incentives “force lines” currently in operation create a bubble of fictitious use cases, imposed software and “non-organic” communities. Darwinism should rule to discard the “dancing bears” (a metaphor of software that hardly works for people, coined by A. cooper). Darwinism led to the long-lasting success of R, python, OpenStack, GitHub, ResearchGate, Hadoop, Spark, etc.



    Karim Chine

THE SEVEN DEADLY SINS OF SCIENCE GATEWAYS INITIATIVES