Azure ML Environments define the containers in which your code will execute. In the simplest case, you can use pip, Conda, or the Azure ML Python SDK to install custom Python libraries; custom Docker images can be used if further customisation is required.
This is a brief guide to creating environments.
An environment can be created from a pip `requirements.txt` file, from a Conda `env.yml` file, or via the Azure ML Python SDK.
From a pip `requirements.txt` file:
from azureml.core import Environment
env = Environment.from_pip_requirements('<env-name>', '<path/to/requirements.txt>')
From a Conda `env.yml` file:
from azureml.core import Environment
env = Environment.from_conda_specification('<env-name>', '<path/to/env.yml>')
Via the Azure ML Python SDK:
from azureml.core import Environment
from azureml.core.conda_dependencies import CondaDependencies
conda = CondaDependencies()
# add channels
conda.add_channel('pytorch')
# add conda packages
conda.add_conda_package('python=3.7')
conda.add_conda_package('pytorch')
conda.add_conda_package('torchvision')
# add pip packages
conda.add_pip_package('pyyaml')
conda.add_pip_package('mpi4py')
conda.add_pip_package('deepspeed')
# create environment
env = Environment('pytorch')
env.python.conda_dependencies = conda
To use a custom Docker image from a container registry:
import os
from azureml.core import Environment
env = Environment('<env-name>')
env.docker.base_image = '<image-name>'
env.docker.base_image_registry.address = '<container-registry-address>'
env.docker.base_image_registry.username = '<acr-username>'
env.docker.base_image_registry.password = os.environ.get("CONTAINER_PASSWORD")
# optional
env.python.user_managed_dependencies = True
env.python.interpreter_path = '/opt/miniconda/envs/example/bin/python'
To learn more about Azure Container Registry credentials (including how to retrieve the registry password), see the Azure Container Registry documentation.
user_managed_dependencies = True
: You are responsible for installing all necessary Python libraries, typically in your Docker image.
interpreter_path
: Only used when user_managed_dependencies=True; sets the Python interpreter path (e.g. the output of `which python`).
It is also possible to have Azure ML manage your Python installation when providing a custom base image, for example from a pip `requirements.txt` file:
Note: In this case Python libraries installed in the Dockerfile will not be available.
env = Environment.from_pip_requirements('<env-name>', '<path/to/requirements.txt>')
env.docker.base_dockerfile = './Dockerfile'
If you want a specific version of the CUDA image, check the nvidia/cuda repository on Docker Hub.
Requirements for a custom base image (see the Dockerfile sketch after this list):
Conda: Azure ML uses Conda by default to manage the Python environment. If you plan to let Azure ML manage your Python environment, Conda is required.
libfuse: required when using Azure ML Datasets.
OpenMPI: required for distributed runs.
nvidia/cuda: (recommended) for GPU training, build the image on top of nvidia/cuda.
Mellanox OFED user space drivers: (recommended) for InfiniBand-enabled SKUs.
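As a rough sketch only (the image tag, apt package names, and Miniconda path below are illustrative assumptions to adapt, not taken from this guide), such a base Dockerfile can be written out locally and wired up via base_dockerfile as shown above:
from azureml.core import Environment

dockerfile = r"""
FROM nvidia/cuda:11.1.1-cudnn8-devel-ubuntu18.04
# OpenMPI for distributed runs, libfuse for Azure ML Datasets
RUN apt-get update && apt-get install -y --no-install-recommends \
        openmpi-bin libopenmpi-dev libfuse-dev fuse wget bzip2 ca-certificates && \
    rm -rf /var/lib/apt/lists/*
# Miniconda so that Azure ML can manage the Python environment
RUN wget -qO /tmp/miniconda.sh https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh && \
    bash /tmp/miniconda.sh -b -p /opt/miniconda && rm /tmp/miniconda.sh
ENV PATH=/opt/miniconda/bin:$PATH
"""

# write the Dockerfile next to your code and reference it from the environment
with open('Dockerfile', 'w') as f:
    f.write(dockerfile)

env = Environment.from_pip_requirements('<env-name>', '<path/to/requirements.txt>')
env.docker.base_dockerfile = './Dockerfile'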
The approach above holds good for a public registry; for a private registry, the code is given below.
env = Environment('<env-name>')
env.docker.base_image = "/my/private/img:tag"  # image repository path
env.docker.base_image_registry.address = "myprivateacr.azurecr.io"  # private registry
# Retrieve username and password from the workspace key vault
env.docker.base_image_registry.username = ws.get_default_keyvault().get_secret("username")
env.docker.base_image_registry.password = ws.get_default_keyvault().get_secret("password")
Finally, to register the environment:
env.register(ws)
Registered environments can be obtained from the workspace handle:
envs = ws.environments  # Dict[str, Environment]
Putting it together:
# create / update, register environment
env = Environment.from_pip_requirements('my-env', 'requirements.txt')
env.register(ws)
# use later
env = ws.environments['my-env']
# get a specific version
env = Environment.get(ws, 'my-env', version=6)
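A retrieved environment is typically attached to a run configuration. A minimal sketch (the script name and compute target name here are placeholders, not from this guide):
from azureml.core import ScriptRunConfig

config = ScriptRunConfig(
    source_directory='.',
    script='train.py',                       # placeholder training script
    compute_target='<compute-target-name>',  # placeholder compute target
    environment=env,                         # the registered environment fetched above
)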
To save and load environments:
env.save_to_directory('<path/to/local/directory>', overwrite=True)
This will generate a directory with two human-readable, editable files:
azureml_environment.json: metadata including the name, version, environment variables, and Python and Docker configuration.
conda_dependencies.yml: a standard Conda dependencies YAML (for more details, see the Conda docs).
To load the environment for future experiments, use the code below:
env = Environment.load_from_directory('<path/to/local/directory>')
To set environment variables on an environment:
env = Environment('example')
env.environment_variables['EXAMPLE_ENV_VAR'] = 'EXAMPLE_VALUE'
When the Conda dependencies are managed by Azure ML (user_managed_dependencies=False, the default), Azure ML checks whether the same environment has already been materialised into a Docker image in the Azure Container Registry associated with the Azure ML workspace. If it is a new environment, Azure ML runs a job preparation step in which it builds a new Docker image for it. You will find an image build log file in the logs, which you can use to monitor the progress of the image build. The job does not begin until the image has been built and pushed to the container registry.
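If you prefer to trigger and monitor the image build yourself rather than waiting for the job preparation step, a minimal sketch (assuming azureml-core's Environment.build and the env and ws handles defined above) is:
# build the image in the workspace container registry and stream the build log
build = env.build(ws)
build.wait_for_completion(show_output=True)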
This image build can take some time, delaying the start of your run. To minimise unnecessary image builds, consider the following:
Reuse the Dockerfile of an existing environment so that you only add one layer to install a few additional packages (see the sketch below).
Install extra Python packages in your user script so that Azure ML treats them as part of your code rather than as a separate environment; if at all feasible, use a setup script (such as the bootstrap.sh example further down).
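For the first option, a rough sketch (assuming env.docker.base_dockerfile also accepts the Dockerfile contents as a string, and with an illustrative base image tag and package names that are not taken from this guide):
from azureml.core import Environment

# a single extra layer on top of an image that already exists
dockerfile = r"""
FROM mcr.microsoft.com/azureml/openmpi4.1.0-cuda11.1-cudnn8-ubuntu18.04
RUN pip install --no-cache-dir deepspeed mpi4py
"""

env = Environment('extra-packages')
env.docker.base_dockerfile = dockerfile
# the packages live in the image itself, so let the image manage Python
env.python.user_managed_dependencies = True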
If you have Docker installed locally, you can build the Docker image from the Azure ML environment locally and push it directly to the workspace ACR. Because local builds can use cached layers, this is recommended when iterating on a Dockerfile.
from azureml.core import Environment
myenv = Environment(name='<env-name>')
registered_env = myenv.register(ws)
registered_env.build_local(ws, useDocker=True, pushImageToWorkspaceAcr=True)
To install packages at run time, place a bootstrap.sh file in the same working directory. A simple version looks like:
echo "Running bootstrap.sh"
pip install torch==1.8.0+cu111
For distributed runs, have only the process with AZ_BATCHAI_TASK_INDEX equal to 0 perform the installation, while the other processes wait for a marker file:
MARKER="/tmp/.azureml_bootstrap_complete"
if [[ $AZ_BATCHAI_TASK_INDEX = 0 ]] ; then
echo "Running bootstrap.sh"
echo "Installing transformers from source"
pip install git+https://github.com/huggingface/transformers
python -c "from transformers import pipeline; print(pipeline('sentiment-analysis')('we love you'))"
pip install datasets
pip install tensorflow
echo "Installation complete"
touch $MARKER
fi
echo "Barrier..."
while [[ ! -f $MARKER ]]
do
sleep 1
done
echo "Bootstrap complete!"
To run this ahead of the training script train.py, use a command like the one below:
from azureml.core import ScriptRunConfig
command = "bash bootstrap.sh && python train.py --learning_rate 1e-5".split()
config = ScriptRunConfig(
source_directory='<path/to/code>',
command=command,
compute_target=compute_target,
environment=environment,
)
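To actually launch this configuration, submit it to an experiment, for example (a minimal sketch; the experiment name is a placeholder):
from azureml.core import Experiment

run = Experiment(ws, '<experiment-name>').submit(config)
run.wait_for_completion(show_output=True)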
Now we can look at Azure Key Vault.
!pip install azure-identity azure-keyvault
For the workspace default Key Vault:
A Key Vault is included with each Azure ML workspace (you can find it in the Azure Portal, under the same resource group as your workspace).
from azureml.core import Workspace
ws = Workspace.from_config()
kv = ws.get_default_keyvault()
The snippet below both sets and gets secrets:
import os
from azureml.core import Run
# add a secret to the keyvault
kv.set_secret(name="<my-secret>", value=os.environ.get("MY_SECRET"))
# get a secret from the keyvault
secret = kv.get_secret(name="<my-secret>")
# equivalently, from inside a submitted run
run = Run.get_context()
secret = run.get_secret(name="<my-secret>")
For a generic Key Vault (one not attached to the workspace), use the azure-identity and azure-keyvault clients:
from azure.identity import DefaultAzureCredential
from azure.keyvault.secrets import SecretClient

kv_url = "https://<my-keyvault>.vault.azure.net"  # your Key Vault URL
secret_name = "<my-secret>"

credential = DefaultAzureCredential()
client = SecretClient(vault_url=kv_url, credential=credential)
my_secret = client.get_secret(secret_name).value
env = Environment('example')
env.environment_variables['POWERFUL_SECRET'] = my_secret
Be sure to add azure-identity and azure-keyvault to your project's requirements in this case.
# Code in a submitted run: retrieve the secret
from azureml.core import Run
run = Run.get_context()
secret_value = run.get_secret(name="mysecret")
# From your workstation: store the secret in the workspace Key Vault
import os
from azureml.core import Workspace
ws = Workspace.from_config()
my_secret = os.environ.get("MY_SECRET")
keyvault = ws.get_default_keyvault()
keyvault.set_secret(name="mysecret", value=my_secret)