Microsoft is extending a lifeline to developers and data scientists who may be drowning in big data stored on the company’s Azure cloud computing platform.
The company has added integrations that allow Visual Studio Code (VS Code) users to explore and manage files and data objects residing in Azure Data Lake Store accounts and browse Data Lake Analytics metadata.
As its name implies, Azure Data Lake Store enables users to store trillions of objects, including petabyte-size files, on a big data storage service that is compatible with Apache HDFS (Hadoop Distributed File System). Azure Data Lake Analytics, meanwhile, is an Apache YARN (Yet Another Resource Negotiator) service that enables large-scale analytics.
Now, courtesy of new Azure Data Lake (ADL) integrations, developers who manage information in those services can use Data Lake Explorer within ADL Tools for Visual Studio Code to get a better and quicker grasp of their cloud-based big data environments. Visual Studio Code is Microsoft’s lightweight, cross-platform code editor.
The toolkit streamlines the Azure login experience with auto-login functionality that slashes the time it takes to access pertinent Azure cloud services, explained Jenny Jiang, a principal program manager at Microsoft’s Big Data group, in a Jan. 3 blog post.
Data Lake Explorer users can also examine Azure Data Lake Analytics metadata in a tree-like hierarchical manner while working in U-SQL, as well as create and delete U-SQL database objects, Jiang added. Inspired by Microsoft’s own distributed runtime for big data systems, U-SQL is a query language intended to help .NET and SQL developers quickly get up to speed on big data applications.
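To give a flavor of the language, here is a minimal illustrative U-SQL script; the file paths and schema are hypothetical, not drawn from Microsoft’s documentation. It shows the SQL-like declarative style paired with C#-style types that lets .NET and SQL developers ramp up quickly:

```usql
// Hypothetical example: paths and schema are illustrative only.
// Extract a CSV file from the Data Lake Store into a rowset.
@people =
    EXTRACT name string,
            age  int
    FROM "/input/people.csv"
    USING Extractors.Csv();

// Filter the rowset with a SQL-style SELECT.
@adults =
    SELECT name, age
    FROM @people
    WHERE age >= 18;

// Write the result back to the store as tab-separated values.
OUTPUT @adults
    TO "/output/adults.tsv"
    USING Outputters.Tsv();
```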
Similarly, Data Lake Explorer users can now delve into Azure Data Lake Store, allowing them to preview, download and delete files without leaving Visual Studio Code. To help keep the workspace tidy, a new command (ADL: Set Git Ignore) allows users to exclude system-generated files and folders from their GitHub repositories, Jiang stated.
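The command works by writing ignore rules into the workspace’s .gitignore file. The entries below are illustrative of the kind of system-generated artifacts such a rule set targets, not the exact output of ADL: Set Git Ignore:

```
# Illustrative .gitignore entries for editor- and tool-generated files
# (not the literal output of the ADL: Set Git Ignore command)
.vscode/
output/
*.bak
```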
Installation instructions for the latest Azure Data Lake Tools for Visual Studio Code, along with links to related documentation and tutorials, are available on the Microsoft Azure blog.
It’s not the first time Microsoft has attempted to make its cloud-based big data offerings more accessible to customers.
In November, the cloud provider rolled out a new dashboard experience with improved visibility into Azure Data Lake Analytics account utilization and cost patterns. Headlining the new interface are Power BI-like visualizations showing Analytics Unit (AU) hours used (a feature not available to pay-as-you-go customers), any overages and the estimated cost of using the service. Analytics Units are a measure of the Azure Data Lake compute resources consumed.
The updated dashboard also provides a snapshot of Azure Data Lake Analytics activity, including the number of jobs currently running, queued or in the pipeline, along with the number of job submitters. Finally, an at-a-glance chart allows users to quickly determine if their accounts have enough AUs to run a job.
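The AU-hour accounting behind those cost visualizations can be sketched in a few lines; the hourly rate below is a hypothetical placeholder, not Microsoft’s actual price, and the function name is invented for illustration:

```python
# Hedged sketch: estimating an Azure Data Lake Analytics job's cost
# from Analytics Unit (AU) hours. The rate is a hypothetical placeholder.
AU_HOURLY_RATE_USD = 2.0  # assumed price per AU-hour, not an actual quote


def estimate_job_cost(allocated_aus: int, runtime_hours: float,
                      rate: float = AU_HOURLY_RATE_USD) -> float:
    """Cost = AUs allocated x hours the job ran x price per AU-hour."""
    return allocated_aus * runtime_hours * rate


# A job holding 10 AUs for 1.5 hours consumes 15 AU-hours.
print(estimate_job_cost(10, 1.5))  # 30.0 at the assumed $2/AU-hour rate
```

Because AUs are reserved for the lifetime of the job, over-allocating AUs that a job cannot use still incurs cost, which is why the dashboard's at-a-glance AU chart matters.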