
Using the OnPremiseClassifier

The OnPremiseClassifier provides a scikit-learn-compatible interface for using Neuralk’s In-Context Learning model with an on-premise or self-hosted NICL server. It is intended for users who have deployed NICL on their own infrastructure or who otherwise need to work against a self-hosted deployment.

NOTE

For this example to run, you need access to a running NICL server. The server URL should be specified via the host parameter when initializing the classifier.

Simple example on toy data

We start by using the OnPremiseClassifier on simple data that needs no preprocessing.

Generate simple data:

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=200, n_features=10, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Ensure data is in the correct format (float32 for features, int64 for labels)
X_train = X_train.astype(np.float32)
X_test = X_test.astype(np.float32)
y_train = y_train.astype(np.int64)

print(f"{X_train.shape=} {y_train.shape=} {X_test.shape=} {y_test.shape=}")
X_train.shape=(160, 10) y_train.shape=(160,) X_test.shape=(40, 10) y_test.shape=(40,)

Now we apply Neuralk’s OnPremiseClassifier.

NOTE

Replace “http://localhost:8000” with the actual URL of your NICL server. If your server requires authentication, you can pass it via the default_headers parameter.

from sklearn.metrics import accuracy_score

from neuralk import OnPremiseClassifier

# Initialize the classifier with your NICL server URL
# Replace with your actual server URL
classifier = OnPremiseClassifier(
    host="http://localhost:8000",  # Replace with your NICL server URL
    model="nicl-small",
    timeout_s=300,
)

# Note: nothing actually happens during fit() -- in-context learning models are
# pretrained and require no fitting on our specific dataset. The fit method
# only stores the training data.
classifier = classifier.fit(X_train, y_train)

# Make predictions
predictions = classifier.predict(X_test)
probabilities = classifier.predict_proba(X_test)

accuracy = accuracy_score(y_test, predictions)
print(f"Accuracy: {accuracy}")
print(f"Predictions shape: {predictions.shape}")
print(f"Probabilities shape: {probabilities.shape}")
Accuracy: 0.875
Predictions shape: (40,)
Probabilities shape: (40, 2)
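
Because predict_proba returns per-class probability scores, they can be fed into any scikit-learn metric that accepts probabilities. The snippet below is a minimal sketch (not part of the example above) that computes the ROC AUC from the probability of the positive class:

from sklearn.metrics import roc_auc_score

# For binary problems, column 1 holds the probability of the positive class
auc = roc_auc_score(y_test, probabilities[:, 1])
print(f"ROC AUC: {auc:.3f}")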

Working with authentication

If your NICL server requires authentication, you can pass authentication headers via the default_headers parameter:

Example with authentication headers

classifier_with_auth = OnPremiseClassifier(
    host="http://localhost:8000",
    model="nicl-small",
    default_headers={"Authorization": "Bearer your-token"},
)
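
In practice you may prefer not to hard-code the token. A small sketch, assuming the token is stored in an environment variable named NICL_API_TOKEN (a hypothetical name; adapt it to your setup):

import os

token = os.environ["NICL_API_TOKEN"]  # hypothetical variable name
classifier_with_auth = OnPremiseClassifier(
    host="http://localhost:8000",
    model="nicl-small",
    default_headers={"Authorization": f"Bearer {token}"},
)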

Advanced configuration

The OnPremiseClassifier supports various configuration options for fine-tuning the connection and request behavior:

Example with advanced configuration

classifier_advanced = OnPremiseClassifier(
    host="http://localhost:8000",
    dataset_name="my-dataset",
    model="nicl-large",  # Use a different model
    timeout_s=600,  # Longer timeout
    metadata={"source": "example", "version": "1.0"},
    user="user123",
    api_version="v1",
)

Integration with scikit-learn pipelines

Like the cloud-based Classifier, OnPremiseClassifier can be integrated into scikit-learn pipelines:

from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Create a pipeline with preprocessing and OnPremiseClassifier
pipeline = make_pipeline(
    StandardScaler(),
    PCA(n_components=5),
    OnPremiseClassifier(
        host="http://localhost:8000",
        model="nicl-small",
    ),
)

# Fit and predict
pipeline.fit(X_train, y_train)
pipeline_predictions = pipeline.predict(X_test)
pipeline_accuracy = accuracy_score(y_test, pipeline_predictions)
print(f"Pipeline accuracy: {pipeline_accuracy}")
Pipeline accuracy: 0.85
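
Because the pipeline follows the standard estimator API, it can also be used with scikit-learn’s model selection utilities. A sketch using cross_val_score (not part of the timed example above); keep in mind that every fold issues requests to the NICL server:

from sklearn.model_selection import cross_val_score

# 3-fold cross-validation of the full preprocessing + NICL pipeline
scores = cross_val_score(pipeline, X_train, y_train, cv=3)
print(f"Cross-validation accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")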

Total running time of the script: (0 minutes 0.610 seconds)