There’s been a lot of confusion around how Inference Providers are supposed to be used:
I don’t think “You have been granted access to this model” necessarily contradicts “Model not supported by provider featherless-ai”.
The short version is:
| Check | What it means |
|---|---|
| “You have been granted access” on the model page | You have access to the gated model repo / weights / model page resources. |
| The browser widget works | Some provider/path available from the widget could run something for that page. It does not necessarily prove your third-party app is using the same provider, model id, task, token scope, or endpoint. |
Model not supported by provider featherless-ai |
|
The selected provider, here featherless-ai , may not currently expose the exact model id and task that your API call is asking for. |
So I would first check the exact model id + provider + task combination before debugging the token or curl syntax too much.
The quickest first check is the model search page with the Inference Providers filter:
https://huggingface.co/models?inference_provider=all
Then search for the exact model id and, if needed, narrow the provider filter to Featherless. If the exact model/provider combination is not listed there, changing the curl call probably will not make that provider serve the model.
Also, one subtle point: meta-llama/Llama-3.1-70B
and meta-llama/Llama-3.1-70B-Instruct
are not interchangeable.
meta-llama/Llama-3.1-70B
meta-llama/Llama-3.1-70B-Instruct
If your third-party app is making chat-completion-style calls, I would first verify whether the Instruct variant is available through the provider you are trying to use, rather than assuming that access to the base repo means the provider can serve it through chat completions.
A practical order of checks would be:
Confirm the exact model id:
meta-llama/Llama-3.1-70B
meta-llama/Llama-3.1-70B-Instruct
meta-llama/Meta-Llama-3.1-70B-Instruct
Check whether that exact model is currently exposed through Inference Providers:
If you are explicitly forcing Featherless, try not forcing it:
provider="auto"
in huggingface_hub
, or:featherless-ai
suffix if you are using the OpenAI-compatible router model name.If it works with auto
but fails with featherless-ai
, that suggests a provider-specific availability/mapping issue, not a general Llama access issue.
Check the local client version if you are using Python:
python -c "import huggingface_hub; print(huggingface_hub.__version__)"
Featherless’ HF integration post says to use huggingface_hub
v0.33.0 or newer:
If you still get the error, the useful info to post back would be:
provider="featherless-ai"
, :featherless-ai
, or auto
huggingface_hub
version, if applicableTo summarize my guess: this is probably not “you do not have access to Llama” in the simple gated-repo sense. It is more likely one of these:
featherless-ai
,The first thing I would rule out is the non-fixable case: is the exact model id currently available through the provider you are forcing?