U-SQL Advanced Analytics: Introducing Cognitive scenarios for Text and Imaging


Yesterday we introduced you to U-SQL Advanced Analytics and showed how Python can be used with U-SQL. Today, we'll show U-SQL's built-in support for Cognitive scenarios for images and text.

Currently, U-SQL supports these cognitive scenarios:

  • Detecting Objects in Images (Tagging)
  • Detecting Emotion in Faces in Images
  • Detecting Text in Images (OCR)
  • Text Key Phrase Extraction
  • Text Sentiment Analysis

Over time, we'll add more support and enhance the integration in many ways.

Here's an example of how U-SQL can be used to detect objects in images:

REFERENCE ASSEMBLY ImageCommon;
REFERENCE ASSEMBLY FaceSdk;
REFERENCE ASSEMBLY ImageEmotion;
REFERENCE ASSEMBLY ImageTagging;
REFERENCE ASSEMBLY ImageOcr;

@imgs =
    EXTRACT FileName string, ImgData byte[]
    FROM @"/images/{FileName:*}.jpg"
    USING new Cognition.Vision.ImageExtractor();

// Count the objects detected in each image and tag them
@objects =
    PROCESS @imgs 
    PRODUCE FileName,
            NumObjects int,
            Tags string
    READONLY FileName
    USING new Cognition.Vision.ImageTagger();

OUTPUT @objects 
    TO "/objects.tsv"
    USING Outputters.Tsv();

In this sample, the Cognition.Vision.ImageTagger() processor detects the objects in each image, writes their count to the NumObjects column, and places a text description of them in the Tags column.
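
Once the tags are in a rowset, ordinary U-SQL can query them. The following is a minimal sketch, not part of the original sample, that could be appended to the script above to keep only images whose Tags text mentions a given word. The search word "car", the rowset name @cars, the output path, and the assumption that Tags is plain text that can be searched as a substring are all illustrative; check the actual Tags output before relying on it.

```usql
// Sketch: assumes the @objects rowset produced by the script above.
// "car" is a hypothetical tag, and the format of the Tags text is an assumption.
@cars =
    SELECT FileName,
           NumObjects,
           Tags
    FROM @objects
    // && short-circuits, so Contains is never called on a null Tags value.
    WHERE Tags != null && Tags.Contains("car");

OUTPUT @cars
    TO "/objects_with_cars.tsv"
    USING Outputters.Tsv();
```

The filter uses the C# operator && rather than AND because U-SQL does not guarantee evaluation order for AND, and the null check has to run first.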

 

Here's an example of how U-SQL can be used to understand text:

REFERENCE ASSEMBLY [TextCommon];
REFERENCE ASSEMBLY [TextSentiment];
REFERENCE ASSEMBLY [TextKeyPhrase];

@WarAndPeace =
    EXTRACT No int,
            Year string,
            Book string,
            Chapter string,
            Text string
    FROM @"/usqlext/samples/cognition/war_and_peace.csv"
    USING Extractors.Csv();

@sentiment =
    PROCESS @WarAndPeace
    PRODUCE No,
            Year,
            Book,
            Chapter,
            Text,
            Sentiment string,
            Conf double
    READONLY No,
             Year,
             Book,
             Chapter,
             Text
    USING new Cognition.Text.SentimentAnalyzer(true);

OUTPUT @sentiment 
    TO "/sentiment.tsv"
    USING Outputters.Tsv();

In this sample, the Cognition.Text.SentimentAnalyzer() processor is used to detect the sentiment of each row's Text and place it in the Sentiment column, with the associated confidence in the Conf column.
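
Because the result is a regular rowset, it can be summarized with standard U-SQL aggregation. The sketch below is an illustration rather than part of the original sample; it could be appended to the script above to count rows and average the confidence for each Book and Sentiment value. The rowset name @sentimentByBook, the column aliases, and the output path are hypothetical.

```usql
// Sketch: assumes the @sentiment rowset produced by the script above.
@sentimentByBook =
    SELECT Book,
           Sentiment,
           COUNT(*) AS NumRows,    // number of text rows with this sentiment
           AVG(Conf) AS AvgConf    // mean confidence reported by the analyzer
    FROM @sentiment
    GROUP BY Book, Sentiment;

OUTPUT @sentimentByBook
    TO "/sentiment_by_book.tsv"
    ORDER BY Book, Sentiment
    USING Outputters.Tsv();
```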

To learn more about our support for U-SQL Advanced Analytics and how to enable it in your Data Lake Analytics Accounts, see our Getting Started guide.


Comments (5)

  1. Shaun says:

    This is awesome, really amazing stuff you guys have done with Data Lake overall and stuff like this just brings it to the next level. Looking forward to what's coming next!

  2. shirley says:

    I put an image file under the ADL store /images/ directory and copied the object detection code exactly, but got the following error messages. Can you help here?
    Description
    All statements in a script must eventually lead to an output.
    Resolution
    Either remove this statement from the script or use its result in the script.
    Details
    at token 'EXTRACT', line 7

    near the ###:

    **************

    RENCE ASSEMBLY ImageCommon;
    REFERENCE ASSEMBLY FaceSdk;
    REFERENCE ASSEMBLY ImageEmotion;
    REFERENCE ASSEMBLY ImageTagging;
    REFERENCE ASSEMBLY ImageOcr;

    @imgs = ### EXTRACT FileName string, ImgData byte[]
    FROM @"/images/{FileName:*}.jpg"
    USING new Cognition.Vision.ImageExtractor();

    // Extract the number of objects
    Error
    E_CSC_USER_REDUNDANTSTATEMENTINSCRIPT

    1. shirley says:

      The problem got resolved with the latest U-SQL in this post. Thanks a lot, Saveen.

  3. Rakesh says:

    I have a JPEG file that contains text, but the output .csv file is all blank. The DLL works with the sample images but doesn't with mine.
    Is there a specification for the JPEG files that OCR can read/detect?

    1. hiren says:

      Rakesh,
      Can you please describe your scenario in more detail? What exactly are you trying to do?
