Data Storage

An overview of the BLOB storage and PostgreSQL database.

Overview

Data is stored in two ways: a BLOB storage for media and files, and a database for parameters and reports.

BLOB Storage

BLOB storage holds images, videos and other files, grouped into containers. The media is the first item that needs to be stored for a job, so the front-end creates the container when media is submitted.

Uploading

    import os

    from azure.storage.blob import BlobServiceClient

    def save_to_blob(self, file_name, uploaded_file_url):
        # Connect using the connection string from the environment
        connect_str = os.getenv('AZURE_STORAGE_CONNECTION_STRING')
        blob_service_client = BlobServiceClient.from_connection_string(connect_str)

        # Generate a name and create the container for this job
        container_name = self.generate_name(10).lower()
        blob_service_client.create_container(container_name)

        # Upload the media; uploaded_file_url is opened as a local file path
        blob_client = blob_service_client.get_blob_client(container=container_name, blob=file_name)
        with open(uploaded_file_url, "rb") as data:
            blob_client.upload_blob(data)

The first step is to connect to the Azure storage account using the connection string that Azure provides for the account. Here it is read from the AZURE_STORAGE_CONNECTION_STRING environment variable.
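Note that os.getenv() returns None when the variable is missing, so a misconfigured environment only shows up later, when the client is constructed. A minimal sketch of a stricter lookup follows; the helper name get_connection_string and the error message are illustrative assumptions, not part of the project's code.

```python
import os

ENV_VAR = "AZURE_STORAGE_CONNECTION_STRING"


def get_connection_string(env=None):
    """Return the storage connection string, or fail with a clear message.

    `env` defaults to os.environ; passing a plain dict makes the helper easy to test.
    (Hypothetical helper, not taken from the original code base.)
    """
    env = os.environ if env is None else env
    value = env.get(ENV_VAR)
    if not value:
        raise RuntimeError(f"{ENV_VAR} is not set; cannot connect to BLOB storage")
    return value
```

The returned value could then be passed to BlobServiceClient.from_connection_string() in place of the direct os.getenv() call.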

To create a container, a name has to be provided for it. Each job (one instance of a media upload) in the pipeline has an associated UUID, produced here by the generate_name() method, and that UUID is used as the container name. It can then be used throughout the pipeline to identify which container a log entry refers to.
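The implementation of generate_name() is not shown in this document. As a rough sketch, assuming it returns a random alphanumeric string of the requested length, it could look like the following. The .lower() call in the upload code matters because Azure container names may only contain lowercase letters, digits and hyphens, and must be 3 to 63 characters long. The is_valid_container_name() helper is also an illustrative assumption.

```python
import re
import secrets
import string

# Azure container names: 3-63 characters, lowercase letters, digits and hyphens,
# starting and ending with a letter or digit, with no consecutive hyphens.
CONTAINER_NAME_RE = re.compile(r"^[a-z0-9](?!.*--)[a-z0-9-]{1,61}[a-z0-9]$")


def generate_name(length):
    """Return a random mixed-case alphanumeric string (assumed behaviour)."""
    alphabet = string.ascii_letters + string.digits
    return "".join(secrets.choice(alphabet) for _ in range(length))


def is_valid_container_name(name):
    """Check a candidate name against the container naming rules above."""
    return CONTAINER_NAME_RE.fullmatch(name) is not None


container_name = generate_name(10).lower()
```

Checking the name before calling create_container() turns a rejected request into an early, local error.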

Once the container has been created, the uploaded media is stored inside it. Completion of the upload triggers the Azure Function, which starts the next service in the pipeline.

Downloading

def get_blob(self, name):
    # Connect to storage
    connect_str = os.getenv('AZURE_STORAGE_CONNECTION_STRING')
    blob_service_client = BlobServiceClient.from_connection_string(connect_str)
    
    # Determine which file is an image
    container_client = blob_service_client.get_container_client(name)
    media_name = self.media_name_from_blob(container_client)
    
    # Get and download blob
    blob_client = blob_service_client.get_blob_client(container=name, blob=media_name)
    with open(media_name, "wb") as download_file:
        download_file.write(blob_client.download_blob().readall())
    
    return media_name

This function (logging omitted for brevity) downloads the image from the container and stores it locally. It then returns the name of the image for the analysis.
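The media_name_from_blob() helper is not shown here. One plausible approach, sketched below under the assumption that the media file can be recognised by its extension, is to scan the blob names in the container and return the first match. The function name pick_media_name and the extension list are hypothetical.

```python
import os

# Assumed set of media extensions; adjust to whatever the pipeline accepts.
MEDIA_EXTENSIONS = {".jpg", ".jpeg", ".png", ".mp4", ".avi"}


def pick_media_name(blob_names):
    """Return the first blob name with a media extension, or None if there is none."""
    for name in blob_names:
        extension = os.path.splitext(name)[1].lower()
        if extension in MEDIA_EXTENSIONS:
            return name
    return None
```

With the Azure SDK, the names could be supplied as `(blob.name for blob in container_client.list_blobs())`.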

Database

The database is a simple PostgreSQL database used to store parameter sets and the data from the analysis. An overview of the structure can be seen here.

Items are stored in the database using the Django ORM. Each model is a Python class that represents one table, and Django generates the tables from the models through its migrations. Multiple services can access the same database (in this case the front-end and the reporting module) as long as they use the same set of models. An example of a model:

import datetime

from django.db import models
from django.utils.translation import gettext_lazy as _


class ActionUnitReport(models.Model):
    class ReportType(models.TextChoices):
        # Stored value, human-readable label
        IMAGE_POSED = 'IP', _('Image - Posed')
        IMAGE_WILD = 'IW', _('Image - Wild')
        VIDEO_SINGLE = 'VS', _('Video - Single')
        VIDEO_MULTIPLE = 'VM', _('Video - Multiple')

    name = models.CharField(max_length=100, default='default', null=False)
    url = models.CharField(max_length=300, null=False)
    date = models.DateField('date added', default=datetime.date.today, null=False)
    avatar = models.CharField(max_length=50, null=False)
    type = models.CharField(
        max_length=2,
        choices=ReportType.choices,
        default=ReportType.IMAGE_POSED,
        null=False
    )

    def __str__(self):
        return "%s %s" % (self.url, self.date)

    class Meta:
        # Explicit table name instead of Django's generated default
        db_table = "action_unit_report"
This model represents the collection of OpenFace data. The only noteworthy field is type: it behaves like an enum whose allowed values are defined by the ReportType class, with the two-letter code stored in the database and the label used for display. The Meta class sets the name of the database table explicitly.
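To make the value/label split concrete without a configured Django project, the sketch below imitates the idea with the standard library's enum module. It is an illustration only, not how Django implements TextChoices: here choices() is a class method, whereas Django exposes choices as an attribute, and only two of the four members are reproduced.

```python
from enum import Enum


class ReportType(Enum):
    """Stand-in for the Django TextChoices class, using only the standard library."""

    def __new__(cls, value, label):
        obj = object.__new__(cls)
        obj._value_ = value  # short code that would be stored in the database
        obj.label = label    # human-readable text shown to users
        return obj

    IMAGE_POSED = ("IP", "Image - Posed")
    VIDEO_SINGLE = ("VS", "Video - Single")

    @classmethod
    def choices(cls):
        """Return (value, label) pairs, the shape a choices option expects."""
        return [(member.value, member.label) for member in cls]


stored = ReportType.VIDEO_SINGLE.value   # code written to the table
restored = ReportType(stored)            # look the member up again by its code
```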
