e-infrastructures underpin the digitisation of industry

On 18 June 2015, Commissioner Oettinger delivered a speech at the Digital Assembly in Riga in which he shared his strategic vision. In his speech, he fully supported the work of e-infrastructures and placed it at the centre of his policy agenda. His endorsement is very good news for the whole e-infrastructure community, and we should build on it to deliver on our commitments. You can find the full text here and the recording here (the relevant part starts at minute 14:00).

These are, in my view, the most important messages for us:

To underpin the digitalisation of industry, innovation and excellence in science, we do need a strong ICT sector based on essential e-infrastructures such as pan-European research networks, data infrastructures and distributed computing. In coordination with the Member states, the Commission aims at supporting such e-Infrastructures.

Investing in state-of-the-art, open and interoperable platforms and innovation e-infrastructures is essential so that businesses can rely on and use them to make products, processes or services ready for the digital age.

Available and easy-to-use High Performance Computing resources are a priority for the industry’s competitiveness – in particular for SMEs – and for better adaptation to market demands, especially in terms of innovation and fast renewal of their product and service offerings. This is one of the main objectives of Industry 4.0.

The European Commission is starting a process of involvement of stakeholders to explore the opportunities offered by the European Fund for Strategic Investment, where High Performance Computing is a key infrastructural component.

In order to improve their performance, increase competitiveness, acquire new technologies and know-how, large companies could more actively involve digital start-ups to anticipate potential disruption and enable innovation.

I propose[d] actions in four key areas to be elaborated with the help of Member States and industry:

First, we need to facilitate access to digital technologies for any industry, especially SMEs, wherever it is located in Europe and in any sector.

Second, leadership in platforms for digital industry. The objective is to ensure the availability of state-of-the-art open and interoperable platforms that any business can use to make its products, processes or services ready for the digital age.

Third, preparing our workforce to benefit from the digital transformation. There is a clear need for promoting digital skills at all levels, for re-skilling, and for lifelong learning across Europe and its regions. Of course, this is the competence of Member States, but given the dimension and urgency of the challenge, I believe we need a concerted effort to be able to progress more rapidly.

Fourth, smart regulation for smart industry. New digital business models are challenging existing regulatory systems worldwide, requiring a new way of policy-making. We need to adapt our regulatory and legal frameworks to digital innovations.

Share widely,

Augusto


Focus on the core, analyze the market and understand the users

It seems everyone today is rushing to build yet another infrastructure, yet another Infrastructure as a Service (IaaS) cloud. We all think we are a snowflake more special than all the others, but at some point we need to take a good look around and come to terms with reality. We are not going to be able to build a better, more efficient IaaS infrastructure than AWS. We will not be able to compete in a market space where prices are in a race to the bottom.

We are not going to bend the trend of inevitable consolidation, which is the market clearly signaling that there is no need for many cloud providers. If a company like HP, with its significant financial resources, wavers over whether or not to become an IaaS provider, research infrastructures should definitely not spend taxpayer money on yet another OpenStack deployment. There is no argument that can justify such a course of action. Period.

What makes an IaaS successful? Deep pockets from the start, an exquisite implementation of DevOps, an ecosystem of people, organizations and companies developing cool new capabilities and services and, yes, some technological innovation that will provide temporary (key word: “temporary”) differentiation. Research infrastructures hardly have experience in any of these key areas, except maybe technological innovation, and even that is limited. The technological innovation needed here comes from experience with operating large infrastructures, not from putting together platforms that support specific scientific projects. This is not to mention that we bet on one cloud automation platform and then had to change course to where the whole industry is going: OpenStack.

Let us step back from the coolness of building our own little cloud and think about our customer: the scientist, the researcher, the student, the innovator. These people and their projects are very similar to startups: what they need is a platform, not another IaaS. In today’s world, where you can get modules for virtually every function and all you have to do is stitch them together into the service or functionality you need, who wants to deal with yet another IaaS? The research community would be much better served by a platform that leverages the many IaaS options available and brokers between the researchers and the cloud providers out there, who know the infrastructure business very well.

We already have a great asset: the backbone that can deliver access to these various resources. That asset is called GEANT. We already have unique expertise: the know-how of managing a federated environment. The European Research Space has the resources and experience to become a cloud broker for all its constituents. Let’s invest smartly in what our customers (the researchers, the academics and the students) need: services, tools, applications. The alternative is a taxpayer-money black hole that cannot be justified by any of the quick arguments we are bound to hear: security, performance, customization. All of those are already covered and delivered at much lower cost than anything the community could build from the ground up. Moreover, as a broker representing a large community, this platform can drive the IaaS providers and influence them in the pursuit of specific capabilities and services.

The fastest-growing market segments for SaaS adoption are the education and health sectors. There are plenty of reasons why SaaS is chosen rather than IaaS. The facts speak for themselves: let us focus on the more important challenges at the higher levels of the stack. Let us focus on the users of scientific infrastructures and their true needs.


THE SEVEN DEADLY SINS OF SCIENCE GATEWAYS INITIATIVES

    1. They act as a Grid “cache-misère”. They maintain the illusion that grids can be “fixed”, “recycled” or made “more appealing”. They delay a necessary moratorium on a costly and obsolete technology and paradigm whose ineluctable death has by now become as obvious as the sun.

    2. They assume that the only way to interact with a federated infrastructure is a job scheduler of some kind. By providing a “federation layer” to e-Infrastructure, they make everything look like a grid. Such a choice compromises the interaction design that could be envisaged at the user-facing layer. In particular, interactive computing and real-time collaboration are no longer possible. The grid mentality should die. Interactive computing (the IPython way) should receive more focus.

    3. They envisage the infrastructure with a pre-cloud mind-set. Even before elasticity, the most compelling feature of clouds is scriptability: a few lines of code can describe and bring to life a complex hardware/software architecture. The back-end for computation can and should be built on the fly, in real time, from libraries of infrastructure-describing scripts. Everything should target Infrastructure-as-a-Service-style clouds and make use of their full potential.

    4. They consider Graphical User Interfaces as just software that can be built by developers and researchers. Building usable man-machine interfaces requires expertise and should be done by people whose job is to design interaction. Usability is hard; it doesn’t just happen. Systematic involvement of interaction designers is key.

    5. They overlook the fact that building sustainable engineering artifacts is different from research, and that the structures and frameworks that work for research projects may not be effective in building and delivering infrastructures and tools for science. They keep reinventing the wheel and proposing yet another “middleware”. They build software in conditions and with processes that do not allow high-quality software to be built. They reproduce again and again the “death march” (E. Yourdon) towards software doomed to fail. They get overwhelmed by the technical complexity and forget that the survival of a piece of software is a more daunting task than its design and construction. Right after the software is delivered, and in the absence of an ecosystem, another “death march” starts, this time towards obsolescence. They should do one of two things. Either take the challenge of building a software ecosystem seriously: involve experts in software design and software ecosystems in the project and in the strategic thinking, treat growing the ecosystem as the core objective, and potentially get the necessary guidance from a central European agency (to be invented) that would provide expertise and coaching. Or get connected from day one to an existing ecosystem and shape the project’s outcome towards becoming an artifact valuable to an established community.

    6. They overlook the fact that an application based on a frozen set of requirements can’t be a tool for science, where everything is moving, exploratory and transient by nature. Scientists love Matlab, R, Python, etc. because those tools allow them to progress towards understanding their data, building their models and comparing their results with others’: they follow a “Brownian motion” towards the unknown. R, Python and Matlab allow them to capture their trajectory, unpredictable in advance, towards a scientifically relevant, “publishable” result in the form of a “script”. That script can be shared and reused as is, or in the form of a component/library/module/package that others can import into their own environments to reproduce their peers’ trajectory before exploring a new one of their own. Science gateways, and the workflow paradigm they often rely on, fall short of allowing such a “hyper-agile”, traceable and reproducible scientific process. If science gateways are ever to be useful to more than a handful of scientists, they have to comply with and empower this way of working, in particular: (a) No IT people should be needed to create those science gateways; scientists should be able to build and deploy them from the R, Python or Matlab command lines. Interaction components, views for data visualization, etc. should be scriptable and easy to combine with the tools scientists use to program with data. (b) Science gateways should bring significant added value to convince scientists to consider them. For instance, enabling real-time collaboration (the Google Docs way) while accessing, analysing and visualising data would bring to scientists’ desks capabilities they are currently eager to have. Adding social components that allow them to engage with each other as small groups or communities would also be valuable.
Those scenarios are no longer science fiction, thanks to the capabilities of cloud technologies and to the maturity reached by hundreds of open source tools, frameworks, computational libraries and infrastructure software.

    7. They lobby to give the science gateway/e-Infrastructure they build a fictitious appearance of popularity. The incentive “force lines” currently in operation create a bubble of fictitious use cases, imposed software and “non-organic” communities. Darwinism should rule and discard the “dancing bears” (a metaphor, coined by A. Cooper, for software that hardly works for people). Darwinism led to the long-lasting success of R, Python, OpenStack, GitHub, ResearchGate, Hadoop, Spark, etc.
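
The scriptability argument in sin 3 can be illustrated with a minimal Python sketch. It is not tied to any real cloud API: `NodeSpec`, `ClusterSpec` and `plan` are hypothetical names invented for this illustration, and the "plan" is just a list of text actions that a real IaaS client would have to carry out. The only point is that a whole back-end can be described, and rescaled, in a few lines of ordinary code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass(frozen=True)
class NodeSpec:
    """Hypothetical description of one kind of virtual machine."""
    role: str
    cpus: int
    memory_gb: int
    count: int = 1


@dataclass
class ClusterSpec:
    """Hypothetical description of a whole back-end for one analysis."""
    name: str
    nodes: List[NodeSpec] = field(default_factory=list)


def plan(cluster: ClusterSpec) -> List[str]:
    """Turn a description into an ordered list of provisioning actions.

    A real tool would hand these to an IaaS API; here they stay as text.
    """
    actions = []
    for node in cluster.nodes:
        for i in range(node.count):
            actions.append(
                f"create {cluster.name}-{node.role}-{i} "
                f"({node.cpus} CPUs, {node.memory_gb} GB)"
            )
    return actions


# The whole architecture fits in a few lines and can be rebuilt on the fly.
analysis = ClusterSpec(
    name="demo",
    nodes=[
        NodeSpec(role="scheduler", cpus=2, memory_gb=8),
        NodeSpec(role="worker", cpus=16, memory_gb=64, count=3),
    ],
)

for action in plan(analysis):
    print(action)
```

Changing `count=3` to another number is all it takes to describe a larger or smaller back-end, which is the sense in which such descriptions can be kept as a library of scripts.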



    Karim Chine


About the draft e-Infrastructure Workprogramme 2016-2017

Let me contribute to this blog by summarising the feelings and suggestions of a devoted e-Infrastructure activist regarding the prospects of this extremely important constituent of the ERA.

This summary is meant to be an independent and, as far as possible, neutral and unbiased compilation of some crucial elements of a personal view on the complex picture that characterises the past, present, and future of our e-Infrastructure facilities and services.

(The text below has not yet been discussed, and has been neither agreed nor disagreed with, by any forum of role-players in this area; it should therefore be read simply as an individual contribution to the related debates. It appears in this blog at the kind invitation of Augusto Burgueno. In a personal e-mail he expressed interest in a comment I had made at a recent e-IRG meeting on the EC input to the related agenda point; after receiving my detailed response, he suggested that I upload the text to his blog. Of course I’m pleased to share my thoughts with the readers. Apart from some minor corrections, the following text is practically the same as that response, although some refinements could surely make it easier to read and understand.)

The issues dealt with below are currently at the centre of discussions about e-Infrastructure development and operation, where the opportunities for the coming years are of key importance to the user community concerned: tens of millions of users in research and innovation, as well as in higher education.

The big questions (most of them closely related to e-Infrastructure missions, roles, functions, responsibilities, influences, impacts, stability, sustainability, collaborations, governance, innovation, share of coverage by service types, by user communities, by geographic regions, etc.) are impossible, or at least difficult, to answer in a single, simple way. That is why they have been on the agenda for several years now. However, the discussions have definitely intensified since Augusto published the summary of his observations, corollaries, and suggestions in his blog.

Investigating and discussing the above issues requires considerable care, good knowledge of the past and present situations, a more or less clear vision of future aims and goals, and wisdom in making hard decisions, especially irreversible ones. Such care, experienced overview, clear forecasting and roadmapping, and wise decision making are surely well established today at the EC, unquestionably the leading role-player in determining the basic European directions and opportunities in the field of e-Infrastructures. As potential advisory bodies there are the key e-Infrastructure organisations, the (mostly public) user communities, and the numerous committees, bodies and organised fora, all available to the EC as sources of integrated, thoroughly discussed, well-founded opinions, advice, and proposals. (Sometimes the number of such sources of suggestions and recommendations even seems a bit too high.)

At the e-IRG meeting I started my comments by emphasising that I was speaking with my e-IRG member’s hat on. (This is important because, on the one hand, I’m a member of some other bodies and committees as well, and on the other hand my input to the discussion was to be considered just the view of a single member, and therefore not at all a well-discussed, well-established, common e-IRG view.)

This means that my opinions below are to be considered at this stage just as food for thought, not as an agreed, widely accepted message (to the EC). However, I’m of course glad to exchange ideas about the related issues.

In my e-IRG contribution, I tried to set out briefly for the other e-IRG members my points on the following questions:

1. Europe has an outstanding e-Infrastructure for research and education (and for innovation). The development and operation of that e-Infrastructure has been a success story since the mid-1980s (or since the early 1990s, if we count EC involvement). The co-operation of the NRENs is one of the best examples of how different cultures and different countries can work together, on the basis of subsidiarity and solidarity and drawing on shared goodwill and the best common expertise, to provide the European user communities (R&E&I) with services globally acknowledged as leading edge. Networking, as the basic component of the e-Infrastructures, is in the best position with its 20-30 years of history, but the more recent e-Infrastructure components are also finding their optimum directions of development and operation. All of them are built on network-enabled remote access to the various resources and services for processing and storing scientific information: supercomputing, grid and cloud computing, virtual facilities, and data manipulation, among others. There is a good and well-developing, but sensitive, balance and share of coverage between the e-Infrastructure operators as service providers in this complex arena.

2. The pleasing status and the unquestionable sensitivity of the role-players in this peculiar European cast are the major reason why special care, knowledge, vision, and wisdom are needed whenever any considerable intervention comes onto the agenda. The well-working, proven model of NREN co-operation should probably be extended, copied and exploited, with due refinements if necessary. This model is based primarily on democracy, self-regulation, self-governance, and independence. It is also based on handling and exploiting complexity, both in functional coverage and in coverage of the user communities. Let me mention just a few important risks of carelessly disturbing the established stability of the present situation:

– Development and operation can’t be separated but should be kept closely connected, in order to avoid alienation, fragmentation of services, conflicting interests and loss of responsibility. (A few years ago the EC requested that JRA, SA and NA types of activity all be included in the GÉANT projects, and that proved to be a good idea.)

– Networking and the novel e-Infrastructure functions mustn’t be separated but should be kept integrated as components of NREN service portfolios that are as complete as possible. This maintains complex knowledge, avoids loss of integral expertise on the part of the developer-operator, and preserves one-stop-shopping opportunities for the users.

– Funding of the developments mustn’t be split into parallel channels (platforms, users, industry) but should be kept in a single channel, in order to avoid incompatibility and loss of interoperability, even if developments can be initiated by various stakeholders and their aims always have to be user-centric. The overall business models wouldn’t allow such a separation either.

3. The NRENs are to be kept as the key actors, and their association is to be considered the major governing body, in the future as well. In case of doubt, one just has to look at the annual Compendium editions of the NREN community. They are impressive and convincing. NRENs not only build, gradually improve, and continuously operate GÉANT (and the national backbones and access networks behind it) but also provide numerous services: an impressive service portfolio for the other e-Infrastructure providers, for the disciplinary Research Infrastructures, and for the extremely wide user communities in research and education. (Their impact on innovation still needs strengthening, but soon that will also be part of the success story.) In this picture the role of PRACE or EGI (to mention just two other key role-players in the e-Infrastructure area) is growing and becoming more important, but it is interesting to observe that the countries in the best e-Infrastructure position are those where the NREN, the NGI and the national supercomputing organisation coincide. That’s another reason why the NREN model and NREN-based governance are real winners.

4. Innovation is a major keyword of our joint aims and goals. (No doubt innovation, through globally successful new products and services, can be one of the crucial tools for strengthening economic potential, competitiveness, and also social welfare in Europe.) However, a frequent and important misunderstanding needs to be corrected here. While innovation is the primary goal of research, the primary concern for e-Infrastructures is stability (which doesn’t mean that innovation within the e-Infrastructure facilities and services is of no interest). This observation also bears on how the KPIs (Key Performance Indicators) are to be treated in the coming period. The Compendium (see above) lists an enormous amount of information on our e-Infrastructures, and if we want to improve our KPI system we have to be able to measure, somehow, the impact of the e-Infrastructures on the research and innovation that exploit, apply and work with our e-Infrastructure services. (Likewise, the effectiveness of the ERA can’t be measured by what the ERA tools and methods look like, but rather by how they help research achieve outstanding results, and Research Infrastructures, together with the e-Infrastructures, are extremely important constituents of that ERA.)

5. The absolute importance of integration rather than separation and fragmentation has already been briefly explained above. Another important requirement is simplification rather than complication in managing and funding the development and operation of the e-Infrastructures. There is no need for new bodies, committees, boards, etc.; rather, the number of such bodies should be reduced, if possible, on the basis of the experience gathered over the last several years.

Although just a few key points have been briefly examined, this blog post is quite long and, due at least partly to the complexity of the issues, probably a bit messy here and there, though I hope it is not very difficult to follow. I also hope that the readers, and first of all Augusto himself, will find some interesting and useful details in it: details that are worth discussing further and taking into consideration when thinking twice about what to do, and how, when revisiting the policies and funding practice for e-infrastructures in Europe.

Thanks again to Augusto for inviting the above contribution to his blog.


Europe needs trust

The ESFRI list will be updated in 2015-16 and will end up presenting some tens of major European Research Infrastructures (RIs) with high priority and with commitment from a number of European countries to drive them to their target. In addition, we have a large number of existing RIs at European level and even more at national level. Altogether this means hundreds of RIs in different areas, from physics experiments to social sciences and humanities, and everything in between.

Imagine a scenario in which all these Research Infrastructures established their own ICT systems, incompatible with and independent of each other. We would not only waste a huge amount of resources and re-invent the wheel a few hundred times, but also run out of competent people to provide data management services, parallelize code for supercomputers, or develop and run many other services.

How could we avoid this? The Horizon2020 program continues the FP7 investments in RIs and excellence in research. The outcome will be much better if researchers and those who provide e-infrastructure and related ICT services, for example national centers or commercial companies, can collaborate. We can do this much more efficiently than today, but it needs something very important: much more trust between the different stakeholders. Will the researchers trust that e-infrastructure providers can help them and address their problems instead of only looking for interesting technical challenges? There is a long history with a lot of failures in this approach. Today things are shifting in a better direction, but the trust still needs to be earned through concrete actions.

Horizon2020 provides an excellent opportunity to address this challenge: how to build trust between research and ICT service providers. Some elements are already there, although more could be done. Already in FP7 there were a few cluster projects in which RIs close to each other worked together and identified common areas, namely the CRISP, BioMedBridges, DASISH and ENVRI projects. However, we should go well beyond what these excellent projects achieved and end up sharing e-infrastructure, services and competence on a much wider basis. The more we share the resources, the more cost-efficient the services become, and the higher the quality that can be reached.

All of this is already possible today. There is no fundamental reason why a Research Infrastructure could not obtain its e-infrastructure from a national center or other service provider. But why is this not done, or at least not much? If we want to share the workload optimally and let everyone concentrate on what they do best (for example, researchers on research and e-infrastructure providers on running the ICT services), we need to build trust between them. If we technology providers remember to develop services in a user-driven mode, it will help.

Building trust takes time. But we do not have much time, or at least, the more time we waste, the greater the danger of duplicating efforts.

There is probably no single piece of wisdom that fixes it all, but at least some actions could be taken to move in the right direction. Here are a few suggestions for you to consider:

  • Build projects where user communities and ICT people work together. The traditional way has been that a number of supercomputing or other centers put together a project, develop services and then start to ‘sell’ them to the users. The problems start if the users do not want to buy the service, or are already using something else. If user communities are partners in the EC-funded project from the start and participate in developing the services together with the ICT providers, it is very likely that the services will be used when ready, and they are also likely to be user friendly. One example of such an approach is EUDAT (eudat.eu), and as coordinator of this project I can conclude that it works. The experience of building trust by working together is very encouraging.
  • Find ways to bring user communities and technology providers together. There are many events suitable for this, but they tend to be populated by the usual suspects, ourselves included. The biggest impact can probably be made by helping an RI that has not been much involved before. How do we find the potential beneficiaries of the future? E-infrastructure providers need to go where the users are and participate in their events; the reverse is likely to have less impact.
  • Make the requirements and costs of ICT in Research Infrastructures visible. When the cost is visible and can be measured, the benefits of doing things together can also be shown in practical terms such as money. If you save on IT, maybe you can hire a few more researchers. Far too often one hears comments such as “it is cheap to run these systems ourselves since electricity costs nothing” (= someone else pays).
  • And finally, make the benefits of collaborating in ICT visible. If ten RIs share the same supercomputer, maybe ten times higher performance can be provided. Or if several groups of researchers need tools to manage their data, maybe it is worthwhile to develop those tools for all of them at the same time.
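
The arithmetic behind the last suggestion can be sketched in a few lines of Python. All numbers are made up for illustration, the helper names `standalone_capacity` and `pooled_capacity` are hypothetical, and the assumption that capacity scales linearly with budget is a deliberate simplification.

```python
def standalone_capacity(budget: float, cost_per_unit: float) -> float:
    """Capacity one RI can buy alone, assuming cost scales linearly."""
    return budget / cost_per_unit


def pooled_capacity(budgets: list, cost_per_unit: float) -> float:
    """Capacity of one shared system bought with the combined budgets."""
    return sum(budgets) / cost_per_unit


# Illustrative figures only: ten RIs, each with the same IT budget.
budgets = [1.0] * 10   # arbitrary money units
cost_per_unit = 0.5    # arbitrary money units per unit of capacity

alone = standalone_capacity(budgets[0], cost_per_unit)
shared = pooled_capacity(budgets, cost_per_unit)

# Peak capacity a single RI can reach on the shared machine when the
# other nine are idle, relative to what it could have bought alone.
peak_gain = shared / alone

# Under this linear model the long-run fair share per RI is unchanged;
# the gain is in the size of the largest job, not in capacity per unit of money.
fair_share = shared / len(budgets)

print(f"alone={alone}, shared={shared}, peak_gain={peak_gain}, fair_share={fair_share}")
```

Under these assumptions the shared machine is ten times larger than anything a single RI could afford, which is one way to read the "ten times higher performance" above. The sketch ignores economies of scale and shared staff, which may add to the case but are not modelled here.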

 

The challenge of building trust is not new to us. The EC has also recognized this, and the results can be seen in the work programs. A lot of excellent work has been done by competent people in building these programs, but I would still like to see more calls where DG RTD (research) and DG Connect (e-infra) work together, and in this way more calls where clusters of user communities work together with ICT providers. The more we work together, the more we start to trust each other and the better the results we will get. I am sure about this, since every one of us wants European research to succeed!

 

Kimmo Koski

Managing Director, CSC – IT Center for Science

Coordinator of EUDAT and EUDAT2020 projects

Kimmo.Koski@csc.fi

 
