---
license: llama3.2
datasets:
- gretelai/synthetic_text_to_sql
language:
- en
base_model:
- meta-llama/Llama-3.2-3B-Instruct
library_name: transformers
tags:
- text-to-sql
---

## Model Details

This model is a fine-tuned version of Llama-3.2-3B-Instruct built specifically for text-to-SQL tasks. It takes a database schema and a natural language question as input and outputs a valid SQL query along with a brief explanation of the query logic. At 3B parameters it is lightweight enough for local deployment on consumer GPUs using 4-bit quantization (see the loading sketch in the appendix at the end of this card).

### Model Description

1) Base Model: unsloth/Llama-3.2-3B-Instruct
2) Fine-tuning Framework: Unsloth (QLoRA)
3) Dataset: gretelai/synthetic_text_to_sql

## Uses

The model was trained using the Alpaca prompt format. For best results, structure your input as shown below (a plain-text copy of the template is included in the appendix at the end of this card):

![image](/static-proxy?url=https%3A%2F%2Fcdn-uploads.huggingface.co%2Fproduction%2Fuploads%2F656b8d33e8bf55919a6aa345%2F1XEAfrZ5iU7doBDy_T5ef.png)

## How to Get Started with the Model

```python
import torch
from transformers import pipeline

model_id = "Ary-007/Text-to-sql-llama-3.2"

# Load the text-generation pipeline (fp16 keeps memory usage low on consumer GPUs)
pipe = pipeline(
    "text-generation",
    model=model_id,
    torch_dtype=torch.float16,
    device_map="auto",
)

# Define the schema (context)
schema = """
CREATE TABLE employees (
    id INT,
    name TEXT,
    department TEXT,
    salary INT,
    hire_date DATE
);
"""

# Define the user question
question = "Find the name and salary of employees in the 'Engineering' department who earn more than 80000."

# Format the prompt exactly as trained
# (the spacing around "Company Database :" and "SQL Prompt :" matches the training format)
prompt = f"""Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Company Database : {schema}

### Input:
SQL Prompt :{question}

### Response:
"""

outputs = pipe(
    prompt,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.1,
    top_p=0.9,
)
print(outputs[0]["generated_text"])
```

## Training Details

The model was fine-tuned using Unsloth on a single Tesla T4 GPU (Google Colab). A sketch of the corresponding Unsloth setup is included in the appendix at the end of this card.

### Hyperparameters

1) Rank (r): 16
2) LoRA Alpha: 16
3) Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
4) Quantization: 4-bit (NF4)
5) Max Sequence Length: 2048
6) Learning Rate: 2e-4
7) Optimizer: adamw_8bit
8) Max Steps: 60

## Dataset Info

The model was trained on the gretelai/synthetic_text_to_sql dataset, using the following fields (a formatting sketch is included in the appendix):

1) sql_context: used as the database schema context.
2) sql_prompt: the natural language question.
3) sql: the target SQL query.
4) sql_explanation: the explanation of the query logic.

## Limitations

1) Training steps: this model was trained for a limited number of steps (60) as a proof of concept. It may not generalize well to very complex or unseen database schemas.
2) Hallucination: like all LLMs, it may generate syntactically correct but logically incorrect SQL. Always validate the output before running it against a production database (a minimal validation sketch is included in the appendix).
3) Scope: it is optimized for standard SQL (similar to SQLite/PostgreSQL) as represented in the GretelAI dataset.

## License

This model is derived from Llama 3.2 and is subject to the Llama 3.2 Community License.
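
## Appendix: Usage Sketches

The snippets below are illustrative sketches rather than part of the original training or release tooling; adapt them to your own environment.

### Prompt template (plain text)

This is the same Alpaca-style template used in the quickstart above, written out for copy-paste; `{schema}` and `{question}` are placeholders for your own schema and question:

```text
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Company Database : {schema}

### Input:
SQL Prompt :{question}

### Response:
```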
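
### Loading in 4-bit

To run the model on a consumer GPU as described under Model Details, it can be loaded with bitsandbytes NF4 quantization. A minimal sketch, assuming `bitsandbytes` and `accelerate` are installed; the quantization settings mirror those listed under Training Details:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "Ary-007/Text-to-sql-llama-3.2"

# NF4 4-bit quantization, matching the training-time setup described above
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)
```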
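
### Reproducing the LoRA setup

A sketch of an Unsloth configuration matching the hyperparameters listed under Training Details. The exact training script was not published with this card, so treat this as an approximation built on Unsloth's `FastLanguageModel` API:

```python
from unsloth import FastLanguageModel

# Load the 4-bit base model with the max sequence length used in training
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-3B-Instruct",
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters with the rank, alpha, and target modules from the card
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
)
```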
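
### Formatting the dataset

A sketch of how the four fields listed under Dataset Info can be mapped into the Alpaca template. The exact response layout used during training (how `sql` and `sql_explanation` were combined) is an assumption here, and `format_example` is a hypothetical helper:

```python
from datasets import load_dataset

dataset = load_dataset("gretelai/synthetic_text_to_sql", split="train")

# Assumed layout: the response holds the SQL query followed by its explanation
ALPACA_TEMPLATE = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Company Database : {schema}

### Input:
SQL Prompt :{question}

### Response:
{sql}

Explanation: {explanation}"""

def format_example(row):
    # Map the dataset fields listed under Dataset Info into one training string
    return ALPACA_TEMPLATE.format(
        schema=row["sql_context"],
        question=row["sql_prompt"],
        sql=row["sql"],
        explanation=row["sql_explanation"],
    )

print(format_example(dataset[0]))
```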
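
### Validating generated SQL before use

As noted under Limitations, generated queries should be validated before they touch production data. One lightweight check is to compile the query against an empty in-memory SQLite database containing only the schema; `EXPLAIN` compiles the statement without executing it. This is SQLite-specific and catches only syntax and schema errors, not logical mistakes:

```python
import sqlite3

def sql_compiles(schema: str, query: str) -> bool:
    """Return True if `query` compiles against `schema` in an empty SQLite database."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.executescript(schema)        # create the tables, with no real data
        conn.execute(f"EXPLAIN {query}")  # compile only; flags syntax and unknown-column errors
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()

schema = """
CREATE TABLE employees (id INT, name TEXT, department TEXT, salary INT, hire_date DATE);
"""
query = "SELECT name, salary FROM employees WHERE department = 'Engineering' AND salary > 80000;"
print(sql_compiles(schema, query))  # True
```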