Building a Text Categorization API with TensorFlow.NET in ASP.NET Core

Introduction

Text categorization helps web applications organize, filter, and make sense of large volumes of text. This tutorial walks through building a text categorization API with TensorFlow.NET and ASP.NET Core. By the end, you’ll have a working API that categorizes text using a pre-trained TensorFlow model.

Prerequisites

Before we dive in, make sure you have the following:

  • Basic knowledge of ASP.NET Core.
  • A working installation of .NET SDK.
  • Python (for training the TensorFlow model).

Step 1: Set Up Your ASP.NET Core Web API Project

Start by creating a new ASP.NET Core Web API project:

dotnet new webapi -n TextCategorizationApi
cd TextCategorizationApi

Open the project in your preferred IDE (e.g., Visual Studio or Visual Studio Code).

Step 2: Add Necessary Packages

Next, install TensorFlow.NET and the required dependencies:

dotnet add package TensorFlow.NET
dotnet add package SciSharp.TensorFlow.Redist

Step 3: Prepare Your TensorFlow Model

If you don’t have a pre-trained TensorFlow model, you can train one using Python. Here’s an example script to train and save a text classification model:

import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense
import numpy as np

# Example training data
texts = ["This is positive", "This is negative", "Very happy", "Very sad"]
labels = [1, 0, 1, 0]

tokenizer = Tokenizer(num_words=1000)
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)
data = pad_sequences(sequences, maxlen=10)

labels = np.array(labels)

model = Sequential([
    Embedding(1000, 64, input_length=10),
    LSTM(64),
    Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(data, labels, epochs=5)

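# Save the model in TensorFlow SavedModel format (writes a 'saved_model' directory)
model.save('saved_model')

With TensorFlow 2.x releases up to 2.15, passing a directory path to model.save writes the SavedModel format, so no separate conversion step is needed. With Keras 3 (TensorFlow 2.16 and later), model.save expects a .keras or .h5 file name; call model.export('saved_model') there to produce the SavedModel directory.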
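
The API has to turn incoming text into the same integer ids the model saw during training, so the vocabulary needs to travel with the model. The sketch below is a plain-Python approximation of how the Keras Tokenizer builds its word index with default settings (lowercasing, punctuation filtering, ranking by frequency with ids starting at 1). The helper names split_words, build_word_index, and word_index_to_json are hypothetical and not part of any library; with the real tokenizer, the equivalent mapping is available as tokenizer.word_index.

```python
import json
from collections import Counter

# Characters the Keras Tokenizer strips by default (assumption: default `filters` argument).
FILTERS = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n'


def split_words(text):
    """Lowercase, replace filtered characters with spaces, split on whitespace."""
    table = str.maketrans({ch: " " for ch in FILTERS})
    return text.lower().translate(table).split()


def build_word_index(texts):
    """Rank words by frequency (ties keep first-seen order); ids start at 1."""
    counts = Counter()
    for text in texts:
        counts.update(split_words(text))
    ranked = sorted(counts.items(), key=lambda kv: kv[1], reverse=True)
    return {word: rank for rank, (word, _) in enumerate(ranked, start=1)}


def word_index_to_json(word_index):
    """Serialize the vocabulary so another runtime (such as the C# API) can load it."""
    return json.dumps(word_index, ensure_ascii=False, indent=2)


texts = ["This is positive", "This is negative", "Very happy", "Very sad"]
word_index = build_word_index(texts)
vocabulary_json = word_index_to_json(word_index)
```

Writing vocabulary_json to a file next to the saved model gives the API a lookup table to use instead of guessing token ids.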

Step 4: Implement the TensorFlow Service in ASP.NET Core

Create a service to load and use the TensorFlow model:

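// Services/TensorFlowService.cs
using System;
using System.Linq;
using Tensorflow;
using Tensorflow.NumPy; // Recent TensorFlow.NET releases ship NDArray here; older releases used NumSharp

namespace TextCategorizationApi.Services
{
    public class TensorFlowService
    {
        private const int MaxSequenceLength = 10; // Must match maxlen in the training script
        private const int VocabularySize = 1000;  // Must match num_words in the training script

        private readonly Session _session;
        private readonly Graph _graph;

        public TensorFlowService(string modelPath)
        {
            // A SavedModel directory holds saved_model.pb plus a variables folder, not a .meta
            // checkpoint, so load it with Session.LoadFromSavedModel instead of import_meta_graph.
            _session = Session.LoadFromSavedModel(modelPath);
            _graph = _session.graph;
        }

        public string PredictCategory(string text)
        {
            // Tokenize and pad the input text as in the training script
            var tokens = TokenizeAndPad(text);

            // Operation names depend on the exported model. The names below are typical for the
            // serving signature of a Keras Sequential model; confirm yours with:
            //   saved_model_cli show --dir saved_model --all
            var input = _graph.OperationByName("serving_default_embedding_input").outputs[0];
            var output = _graph.OperationByName("StatefulPartitionedCall").outputs[0];

            var result = _session.run(output, new FeedItem(input, tokens));

            // The sigmoid output is a probability between 0 and 1, so threshold it
            // rather than comparing it with the string "1".
            var probability = result.ToArray<float>()[0];
            return probability >= 0.5f ? "Positive" : "Negative";
        }

        private NDArray TokenizeAndPad(string text)
        {
            // Placeholder tokenizer: hashing words does NOT reproduce the Keras Tokenizer's word
            // index, and string.GetHashCode() varies between processes on .NET Core. For meaningful
            // predictions, export the word index from Python and look each word up here instead.
            var ids = text.ToLowerInvariant()
                .Split(' ', StringSplitOptions.RemoveEmptyEntries)
                .Select(word => (float)(Math.Abs(word.GetHashCode() % (VocabularySize - 1)) + 1))
                .ToArray();

            // pad_sequences pads and truncates at the front by default, so keep the last
            // MaxSequenceLength ids and right-align them; the leading slots stay 0.
            var padded = new float[1, MaxSequenceLength];
            var count = Math.Min(ids.Length, MaxSequenceLength);
            for (var i = 0; i < count; i++)
            {
                padded[0, MaxSequenceLength - count + i] = ids[ids.Length - count + i];
            }
            return np.array(padded);
        }
    }
}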
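
To check the service against the training pipeline, it helps to have a small reference for what the encoding step should produce. The sketch below is a plain-Python approximation of texts_to_sequences followed by pad_sequences with default settings (unknown words dropped, padding and truncation at the front), plus the 0.5 threshold applied to the sigmoid output. The function names and the sample word_index are hypothetical, and punctuation filtering is left out for brevity.

```python
def encode_and_pad(text, word_index, num_words=1000, maxlen=10):
    """Map words to ids, drop unknown words, then left-pad or left-truncate to maxlen."""
    ids = [
        word_index[word]
        for word in text.lower().split()
        if word in word_index and word_index[word] < num_words
    ]
    ids = ids[-maxlen:]                      # truncating='pre': keep the last maxlen ids
    return [0] * (maxlen - len(ids)) + ids   # padding='pre': zeros go in front


def categorize(probability, threshold=0.5):
    """Turn the model's sigmoid output into a label."""
    return "Positive" if probability >= threshold else "Negative"


# Hypothetical vocabulary; in practice this comes from the trained tokenizer.
word_index = {"this": 1, "is": 2, "very": 3, "positive": 4, "negative": 5, "happy": 6, "sad": 7}

short_example = encode_and_pad("Very happy", word_index)
long_example = encode_and_pad(" ".join(["this"] * 12), word_index)
```

Feeding the same sentence through this function and through the C# TokenizeAndPad method should yield identical arrays once the service uses the exported vocabulary.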

Step 5: Register the TensorFlow Service in Startup.cs

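In projects that have a Startup.cs (templates for .NET 5 and earlier), register the TensorFlow service as a singleton in ConfigureServices so the model is loaded once and shared across requests. Templates for .NET 6 and later have no Startup.cs; there, make the same AddSingleton call on builder.Services in Program.cs.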

public void ConfigureServices(IServiceCollection services)
{
    services.AddControllers();
    var modelPath = "path_to_your_model";
    services.AddSingleton(new TensorFlowService(modelPath));
}

Step 6: Create a Controller to Use the TensorFlow Service

Create a controller to handle predictions:

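// Controllers/PredictController.cs
using Microsoft.AspNetCore.Mvc;
using TextCategorizationApi.Models;
using TextCategorizationApi.Services;

namespace TextCategorizationApi.Models
{
    // Request body: { "text": "..." }. Move this class to Models/TextData.cs if you prefer one type per file.
    public class TextData
    {
        public string Text { get; set; } = string.Empty;
    }
}

namespace TextCategorizationApi.Controllers
{
    [Route("api/[controller]")]
    [ApiController]
    public class PredictController : ControllerBase
    {
        private readonly TensorFlowService _tensorFlowService;

        public PredictController(TensorFlowService tensorFlowService)
        {
            _tensorFlowService = tensorFlowService;
        }

        [HttpPost]
        public ActionResult Predict([FromBody] TextData input)
        {
            // Reject empty input before it reaches the model
            if (string.IsNullOrWhiteSpace(input?.Text))
            {
                return BadRequest("The 'text' field is required.");
            }

            var category = _tensorFlowService.PredictCategory(input.Text);
            return Ok(new { input.Text, Category = category });
        }
    }
}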

Step 7: Test Your API

Run your API:

dotnet run

Use a tool like Postman or cURL to send requests to your API endpoint:

Example Request Using Postman:

  • URL: http://localhost:5000/api/predict (replace 5000 with the port that dotnet run prints at startup)
  • Method: POST
  • Body:
    {
        "text": "This is a sample text to categorize."
    }
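
The same request can also be scripted. The following is a minimal sketch using only the Python standard library; the helper names build_predict_request and predict are hypothetical, and the default base URL assumes the API listens on port 5000 as in the example above.

```python
import json
import urllib.request


def build_predict_request(text, base_url="http://localhost:5000"):
    """Build the POST request for /api/predict with a JSON body."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(text, base_url="http://localhost:5000"):
    """Send the request and return the decoded JSON response (requires the API to be running)."""
    with urllib.request.urlopen(build_predict_request(text, base_url)) as response:
        return json.loads(response.read().decode("utf-8"))


# Building the request does not touch the network; call predict(...) once the API is up.
request = build_predict_request("This is a sample text to categorize.")
```

With the API running, predict("This is a sample text to categorize.") should return the submitted text together with the predicted category.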
    

Conclusion

In this article, we’ve built a text categorization API with TensorFlow.NET and ASP.NET Core: a Python script trains and exports the model, a singleton service loads it, and a controller exposes predictions over HTTP. The same structure can serve as a starting point for organizing user feedback, filtering content, or extracting insights from text.
