The success or failure of enterprises looking to deploy large-scale cloud-first development projects has very little to do with the development of the actual solutions.
Whether the solution is something as simple as an AI-driven customer support chatbot or as complex as an automated global supply chain, success will depend upon work done very early on to build a solid underlying data architecture that will power the ultimate solution.
Data is the common currency of innovation. It gives shape to and defines the ultimate value of any and all meaningful enterprise projects. When combined with artificial intelligence (AI) and human insight (analytics), data can truly transform an enterprise.
But wrangling data at scale in support of a solution that might impact the entire business is not easy. Enterprise practitioners must gather enough “trusted” data to get the job done. They must apply the data science expertise necessary to understand and take action on that data. And they need to somehow put those actions into production at scale and over time.
There are many moving pieces involved in accomplishing such a formidable task, but there are some key emerging technology areas that we believe will play a major role in driving the success of any large-scale, data-driven initiative in 2020.
Data catalogues: putting data intelligence over business intelligence
Business intelligence (BI) has served as a core enterprise competency for more than 40 years, but lately it has shown its age, failing to keep up with the speed and variety of data that must be analysed to drive meaningful business decisions.
Lightweight data visualisation tools have done much to modernise BI, putting analytics into the hands of a wide array of decision-makers. But even those tools have failed to keep pace and have often created more problems than they solve by encouraging the free use of personal, ungoverned data sources.
This has led to an intense focus on data management and governance. Vendor communities (BI/data visualisation, data integration, and cloud platform) are therefore seeking to help enterprise customers prioritise the front end of the data and analytics pipeline – specifically the ingestion and registration of data.
In short, customers need a better, more flexible means of ingesting, registering, validating and distributing data sources. Enterprise practitioners should invest in tools such as Microsoft Azure Data Catalog and Tableau Catalog, which can bring the focus back to the front end of the pipeline without enforcing any draconian data warehousing requirements.
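The register-then-validate discipline these catalogue tools impose can be illustrated with a minimal sketch. All the names here (`DataAsset`, `CATALOGUE`, `register`) are hypothetical illustrations of the pattern, not any vendor's API:

```python
# A minimal sketch of registering and validating a data source at the
# front end of the pipeline, as a data catalogue would. Hypothetical
# names throughout -- this is not the Azure or Tableau API.
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class DataAsset:
    name: str
    source: str   # e.g. "s3://sales-raw/2020/"
    owner: str
    schema: dict  # column name -> declared type
    registered_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

CATALOGUE = {}  # asset name -> DataAsset

def register(asset):
    """Refuse ungoverned sources: every asset needs an owner and a schema."""
    if not asset.owner or not asset.schema:
        raise ValueError(f"{asset.name}: refusing to register ungoverned data")
    CATALOGUE[asset.name] = asset

register(DataAsset(
    name="quarterly_sales",
    source="s3://sales-raw/2020/",
    owner="finance-data-team",
    schema={"region": "string", "revenue": "decimal"},
))
```

The point of the guard clause is the governance shift described above: a personal, ownerless spreadsheet simply never makes it into the catalogue, so downstream consumers only ever see registered, attributable sources.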
Development environments: bringing AI and DevOps together
The accelerated use of artificial intelligence (AI) across the technology landscape has reached a point where AI can no longer be considered an isolated enterprise endeavour. AI technologies, be those data pipelines, AI frameworks, development tools, platforms, or even AI-accelerated hardware, are all readily available and largely operationalised for widespread and rapid enterprise adoption.
Unfortunately, this embarrassment of riches brings with it a host of AI-specific operational complexities that get in the way of the DevOps ideal of continuous deployment, delivery and integration.
Enterprise buyers, therefore, should look for tools that can unify this life-cycle – tools such as AWS SageMaker, which has for some time offered the ability to move machine learning (ML) models between development and production. Recently, AWS took another important step toward what can be called MLOps with the introduction of a fully integrated development environment, Amazon SageMaker Studio. This single solution lets developers and data scientists work side-by-side, writing code, managing experiments, visualising data and managing deployed solutions.
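The dev-to-production promotion at the heart of MLOps can be sketched in a few lines. This is a hand-rolled illustration of the lifecycle such platforms automate; the registry, stage names and quality gate below are hypothetical, not SageMaker's API:

```python
# A toy model registry illustrating the MLOps promotion lifecycle:
# models land in "development" and only cross into "production" if
# their tracked experiment metrics pass a quality gate.
REGISTRY = {}  # model name -> {"stage", "version", "metrics"}

def register_model(name, version, metrics):
    """Every candidate model starts its life in development."""
    REGISTRY[name] = {"stage": "development", "version": version, "metrics": metrics}

def promote(name, min_accuracy=0.9):
    """Promote to production only if the recorded metrics clear the bar."""
    entry = REGISTRY[name]
    if entry["metrics"].get("accuracy", 0.0) < min_accuracy:
        raise ValueError(f"{name} v{entry['version']} failed the quality gate")
    entry["stage"] = "production"

register_model("churn-classifier", version=3, metrics={"accuracy": 0.93})
promote("churn-classifier")
```

In a real MLOps toolchain the registry, the metric tracking and the gate are managed services rather than dictionaries, but the continuous-deployment shape is the same.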
Cloud object storage: demanding cross-cloud data portability
When it comes to storing large volumes of dynamic data at scale, cloud-native object storage solutions such as Amazon Simple Storage Service (S3) have a lot to offer and have quickly taken up the reins as the data lake of choice for many customers. Customers need not worry about what kind of data lands in an S3 bucket. Whether it’s structured, unstructured or time-sensitive, all data can now be managed within a single architecture. Such data repositories, however, are anything but open, locking customers into the host platform provider.
Enterprises, therefore, should look for solutions that can offer some degree of portability. The trouble is that at present there aren’t many options, as cloud providers have been slow to open up their important data stores. This is changing, however, with interoperability efforts such as Microsoft’s AzCopy (which copies S3 buckets to Azure Blob Storage). Even this, however, stops short, focusing on migration rather than on interoperability. Fortunately, some options are emerging, such as the third-party MinIO Object Storage service, which has the potential to serve as a true layer of compatibility spanning multiple object stores.
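As a concrete illustration of the AzCopy migration path mentioned above, the following command-line fragment sketches an S3-bucket-to-Blob copy. It follows the AzCopy v10 syntax; the bucket, storage account and container names are placeholders, and the destination requires a SAS token you generate in Azure:

```shell
# Authenticate to the S3 source via standard AWS environment variables.
export AWS_ACCESS_KEY_ID="<your-key-id>"
export AWS_SECRET_ACCESS_KEY="<your-secret-key>"

# Copy the whole bucket into an Azure Blob container (destination
# authorised by an appended SAS token).
azcopy copy \
  "https://s3.amazonaws.com/my-source-bucket/" \
  "https://myaccount.blob.core.windows.net/my-container/<SAS-token>" \
  --recursive
```

Note that this is a one-way bulk transfer, which is exactly the limitation the article flags: it migrates the data out of S3 rather than letting applications address both stores through one interface.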
NS Tech and GlobalData are part of the same group