Building a Text Categorization API with TensorFlow.NET in ASP.NET Core

Introduction

Text categorization helps web applications organize, filter, and make sense of large volumes of text. This tutorial walks through building a text categorization API with TensorFlow.NET and ASP.NET Core. By the end, you’ll have a working API that classifies text with a pre-trained TensorFlow model.

Prerequisites

Before we dive in, make sure you have the following:

  • Basic knowledge of ASP.NET Core.
  • A working installation of .NET SDK.
  • Python (for training the TensorFlow model).

Step 1: Set Up Your ASP.NET Core Web API Project

Start by creating a new ASP.NET Core Web API project:

dotnet new webapi -n TextCategorizationApi
cd TextCategorizationApi

Open the project in your preferred IDE (e.g., Visual Studio or Visual Studio Code).

Step 2: Add Necessary Packages

Next, install TensorFlow.NET together with SciSharp.TensorFlow.Redist, which supplies the native TensorFlow binaries:

dotnet add package TensorFlow.NET
dotnet add package SciSharp.TensorFlow.Redist

Step 3: Prepare Your TensorFlow Model

If you don’t have a pre-trained TensorFlow model, you can train one using Python. Here’s an example script to train and save a text classification model:

import tensorflow as tf
from tensorflow.keras.preprocessing.text import Tokenizer
from tensorflow.keras.preprocessing.sequence import pad_sequences
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Embedding, LSTM, Dense
import numpy as np

# Example training data
texts = ["This is positive", "This is negative", "Very happy", "Very sad"]
labels = [1, 0, 1, 0]

tokenizer = Tokenizer(num_words=1000)
tokenizer.fit_on_texts(texts)
sequences = tokenizer.texts_to_sequences(texts)
data = pad_sequences(sequences, maxlen=10)

labels = np.array(labels)

model = Sequential([
    Embedding(1000, 64, input_length=10),
    LSTM(64),
    Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
model.fit(data, labels, epochs=5)

# Save the model
model.save('text_categorization_model')

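In TensorFlow 2.x releases that bundle Keras 2, calling model.save with a directory path (no file extension) already writes the TensorFlow SavedModel format, so the script above produces a SavedModel. To write it to a directory named saved_model instead, change the path: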

model.save('saved_model')
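
Whatever calls the model later has to turn text into the same integer IDs the training script produced. The following is a dependency-free sketch of what Tokenizer and pad_sequences do with their default settings (lowercasing, punctuation filtering, frequency-ranked IDs starting at 1, zero padding at the front). The function names split_words, build_word_index, and encode are illustrative and not part of Keras, and the JSON export is one assumed way of handing the vocabulary to another runtime.

```python
import json
from collections import Counter

# Characters the Keras Tokenizer strips by default (its `filters` argument).
KERAS_FILTERS = '!"#$%&()*+,-./:;<=>?@[\\]^_`{|}~\t\n'


def split_words(text):
    """Lowercase, turn filtered characters into spaces, and split on spaces."""
    table = str.maketrans({ch: " " for ch in KERAS_FILTERS})
    return [word for word in text.lower().translate(table).split(" ") if word]


def build_word_index(texts):
    """Assign 1-based IDs by descending frequency; ties keep first-seen order."""
    counts = Counter()
    for text in texts:
        counts.update(split_words(text))
    # Counter keeps insertion order and sorted() is stable, so ties stay in first-seen order.
    ranked = sorted(counts.items(), key=lambda item: item[1], reverse=True)
    return {word: rank for rank, (word, _) in enumerate(ranked, start=1)}


def encode(text, word_index, num_words=1000, maxlen=10):
    """Map words to IDs, drop unknown or out-of-range words, then pad at the front."""
    ids = [word_index[w] for w in split_words(text)
           if w in word_index and word_index[w] < num_words]
    ids = ids[-maxlen:]                     # truncating='pre': keep the last maxlen IDs
    return [0] * (maxlen - len(ids)) + ids  # padding='pre': zeros go in front


texts = ["This is positive", "This is negative", "Very happy", "Very sad"]
word_index = build_word_index(texts)
print(encode("Very happy", word_index))  # [0, 0, 0, 0, 0, 0, 0, 0, 3, 6]

# Serialized vocabulary; writing it to a file lets a non-Python client load the same mapping.
vocabulary_json = json.dumps(word_index)
```

With the real Keras Tokenizer, the equivalent mapping is available as tokenizer.word_index, which can be exported with json.dumps in the same way.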

Step 4: Implement the TensorFlow Service in ASP.NET Core

Create a service to load and use the TensorFlow model:

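// Services/TensorFlowService.cs
using System;
using System.Linq;
using NumSharp; // In recent TensorFlow.NET releases, NDArray and np live in Tensorflow.NumPy instead
using Tensorflow;
using static Tensorflow.Binding;

namespace TextCategorizationApi.Services
{
    public class TensorFlowService
    {
        private const int MaxSequenceLength = 10; // Must match maxlen in the training script

        private readonly string _modelPath;
        private readonly Session _session;
        private readonly Graph _graph;

        public TensorFlowService(string modelPath)
        {
            _modelPath = modelPath;
            _graph = new Graph().as_default();
            _session = tf.Session(_graph);

            // NOTE: import_meta_graph and Saver.restore expect a TensorFlow 1.x-style checkpoint
            // (a .meta file plus checkpoint data), not the saved_model.pb that model.save writes.
            // Export a checkpoint, or use the loading call that matches your model format
            // and TensorFlow.NET version.
            tf.train.import_meta_graph($"{_modelPath}/saved_model.meta");
            tf.train.Saver().restore(_session, _modelPath);
        }

        public string PredictCategory(string text)
        {
            // Tokenize and pad the input text as in the training script
            var tokens = TokenizeAndPad(text);
            var inputTensor = _graph.OperationByName("input_1").outputs[0]; // Adjust to your model's input name
            var outputTensor = _graph.OperationByName("dense_2/Sigmoid").outputs[0]; // Adjust to your model's output name

            var result = _session.run(outputTensor, new FeedItem(inputTensor, tokens));

            // The sigmoid output is a probability in [0, 1], so threshold it instead of comparing to "1"
            var score = result.ToArray<float>()[0];
            return score >= 0.5f ? "Positive" : "Negative";
        }

        private NDArray TokenizeAndPad(string text)
        {
            // PLACEHOLDER: these hashed IDs do not match the Keras Tokenizer vocabulary.
            // For meaningful predictions, load the word index used in training and look words up in it.
            var tokens = text.ToLowerInvariant()
                .Split(' ', StringSplitOptions.RemoveEmptyEntries)
                .Select(word => (float)StableHash(word))
                .ToArray();

            // pad_sequences pads and truncates at the front by default, so keep the last tokens
            var paddedTokens = new float[MaxSequenceLength];
            var count = Math.Min(tokens.Length, MaxSequenceLength);
            Array.Copy(tokens, tokens.Length - count, paddedTokens, MaxSequenceLength - count, count);
            return np.array(new[] { paddedTokens });
        }

        // string.GetHashCode() is randomized per process on .NET Core and can be negative,
        // so use a deterministic hash in the range 1..999 (0 is reserved for padding)
        private static int StableHash(string word) =>
            1 + word.Aggregate(0, (hash, c) => (hash * 31 + c) % 999);
    }
}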
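
The model ends in a single sigmoid unit, so its raw prediction is a probability between 0 and 1 rather than an exact 0 or 1. Below is a minimal Python sketch of the final mapping step, assuming a cutoff of 0.5 and the label convention from the training script (1 means positive). The function name score_to_category is illustrative.

```python
def score_to_category(score, threshold=0.5):
    """Map a sigmoid probability to a label; label 1 meant 'positive' in training."""
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"expected a probability in [0, 1], got {score}")
    return "Positive" if score >= threshold else "Negative"


for score in (0.08, 0.5, 0.93):
    print(score, score_to_category(score))
```

Raising the threshold trades recall for precision on the positive class, and 0.5 is only a conventional starting point.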

Step 5: Register the TensorFlow Service in Startup.cs

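Add the following code to register the TensorFlow service as a singleton, so the model is loaded once at startup. Projects created from the .NET 6 or later templates have no Startup.cs; there, make the equivalent calls on builder.Services in Program.cs.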

public void ConfigureServices(IServiceCollection services)
{
    services.AddControllers();
    var modelPath = "path_to_your_model";
    services.AddSingleton(new TensorFlowService(modelPath));
}

Step 6: Create a Controller to Use the TensorFlow Service

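Create a controller to handle predictions. The request body is bound to a small TextData model with a single Text property, which is defined at the bottom of the file here for brevity:

// Controllers/PredictController.cs
using Microsoft.AspNetCore.Mvc;
using TextCategorizationApi.Models;
using TextCategorizationApi.Services;

namespace TextCategorizationApi.Controllers
{
    [Route("api/[controller]")]
    [ApiController]
    public class PredictController : ControllerBase
    {
        private readonly TensorFlowService _tensorFlowService;

        public PredictController(TensorFlowService tensorFlowService)
        {
            _tensorFlowService = tensorFlowService;
        }

        [HttpPost]
        public ActionResult Predict([FromBody] TextData input)
        {
            // Reject missing or empty text before it reaches the model
            if (string.IsNullOrWhiteSpace(input?.Text))
            {
                return BadRequest("The request body must contain a non-empty \"text\" field.");
            }

            var category = _tensorFlowService.PredictCategory(input.Text);
            return Ok(new { input.Text, Category = category });
        }
    }
}

namespace TextCategorizationApi.Models
{
    // Request body model; in a larger project, move this to Models/TextData.cs
    public class TextData
    {
        public string Text { get; set; }
    }
}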

Step 7: Test Your API

Run your API:

dotnet run

Use a tool like Postman or cURL to send requests to your API endpoint:

Example Request Using Postman:

  • URL: http://localhost:5000/api/predict (use the port that dotnet run prints at startup if it differs)
  • Method: POST
  • Body:
    {
        "text": "This is a sample text to categorize."
    }
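
The same request can also be scripted. The sketch below uses only the Python standard library and assumes the API listens on http://localhost:5000, as in the example URL above. The helper names build_predict_request and predict are illustrative.

```python
import json
import urllib.request


def build_predict_request(text, base_url="http://localhost:5000"):
    """Build the POST request for the predict endpoint without sending it."""
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def predict(text, base_url="http://localhost:5000"):
    """Send the request and return the decoded JSON response."""
    request = build_predict_request(text, base_url)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)


# Example (requires the API to be running):
# print(predict("This is a sample text to categorize."))
```

With ASP.NET Core's default JSON settings, the anonymous object returned by the controller should serialize with camel-cased keys, so the response is expected to contain a text field and a category field.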
    

Conclusion

In this article, we walked through creating a text categorization API using TensorFlow.NET and ASP.NET Core. With a trained TensorFlow model behind a simple HTTP endpoint, you can add text analysis to your .NET applications, whether you’re organizing user feedback, filtering content, or extracting insights from textual data.
