Hi,
We’re running Vault Professional 10.0.2 and have an older codebase that was originally stored using ISO‑8859‑1 (Latin‑1) encoding. We want to migrate everything to UTF‑8 going forward.
When trying to install Vault using a UTF‑8 SQL Server collation, the installer fails because Vault still uses NTEXT columns, and SQL Server doesn’t allow _UTF8 collations with legacy LOB types. So it seems Vault cannot run on a UTF‑8 SQL collation today.
Questions:
* Is there any supported way to run Vault 10.0.2 with a UTF‑8 SQL Server collation?
* What is the recommended method to migrate an existing ISO‑8859‑1 repository to UTF‑8? Export/import? External file conversion?
* Is Vault fully Unicode internally even when SQL collation is not UTF‑8?
* Are there plans to replace NTEXT/TEXT columns so that Vault can support UTF‑8 SQL collations in the future?
Our goal is to ensure all future commits are UTF‑8 and avoid encoding issues.
Thanks!
How can we move our Vault 10.0.2 repository from ISO‑8859‑1 to UTF‑8?
Re: How can we move our Vault 10.0.2 repository from ISO‑8859‑1 to UTF‑8?
Hello,
I have addressed your concerns below:
* Is there any supported way to run Vault 10.0.2 with a UTF‑8 SQL Server collation?
For existing databases, no. Once a database is created the collation is set and we don't have any tools available to convert.
* What is the recommended method to migrate an existing ISO‑8859‑1 repository to UTF‑8?
While Vault itself has no problem storing any type of file (since file contents are stored as raw bytes), the available collations are limited to those that still support IMAGE / NTEXT columns.
* Is Vault fully Unicode internally even when SQL collation is not UTF‑8?
Yes. UTF-16, UTF-8, DBCS, etc. are all fine for your files. Vault works with files regardless of encoding and stores changes to file data as raw bytes. Because of how file data is stored in the Vault-related databases, the file encoding is irrelevant whatever the collation.
* Are there plans to replace NTEXT/TEXT columns so that Vault can support UTF‑8 SQL collations in the future?
This is our first feature request and it has been logged to be considered for upcoming releases.
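To illustrate the "raw bytes" point above, here is a small Python sketch. It is not Vault code: `store_and_fetch` is a hypothetical stand-in for any store that preserves bytes exactly. It only shows that the same text has different byte sequences in ISO‑8859‑1 and UTF‑8, and that a byte-preserving store returns whichever sequence it was given, without interpreting it.

```python
# Illustration only: store_and_fetch is a hypothetical stand-in for a
# byte-preserving store, not a Vault API.

def store_and_fetch(data: bytes) -> bytes:
    """Return an exact copy of the bytes, with no decoding or re-encoding."""
    return bytes(data)

text = "café"

# The same characters produce different byte sequences per encoding.
latin1_bytes = text.encode("iso-8859-1")   # 'é' is the single byte 0xE9
utf8_bytes = text.encode("utf-8")          # 'é' is the two bytes 0xC3 0xA9

# A byte-preserving store hands back exactly what it received.
assert store_and_fetch(latin1_bytes) == latin1_bytes
assert store_and_fetch(utf8_bytes) == utf8_bytes

# The two encodings are not interchangeable at the byte level.
assert latin1_bytes != utf8_bytes
```

So the encoding of a file is decided by whoever writes the file, not by the database collation.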
Thanks,
Tonya
Re: How can we move our Vault 10.0.2 repository from ISO‑8859‑1 to UTF‑8?
Thanks Tonya, that clarifies a lot.
So just to summarize my understanding:
Vault 10.0.2 cannot run on a SQL UTF‑8 collation, because the schema still uses NTEXT/IMAGE column types, and SQL Server blocks UTF‑8 collations on those types.
Changing an existing Vault database to UTF‑8 is not supported, and there is no migration tool for that today.
Vault already stores file contents as raw bytes, so encoding (ISO‑8859‑1 → UTF‑8, etc.) does not affect how Vault stores or retrieves file data.
This means I can simply convert my working files to UTF‑8 in the client environment, and Vault will store them correctly regardless of database collation.
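For anyone following the same route, the client-side conversion could be sketched in Python roughly as below. This is only a sketch under assumptions: the function names `convert_file_to_utf8` and `convert_tree`, the suffix list, and the idea of running it over a checked-out working folder are all mine, not anything Vault provides. It assumes every matching file that is not already valid UTF‑8 really is ISO‑8859‑1.

```python
from pathlib import Path


def convert_file_to_utf8(path: Path, source_encoding: str = "iso-8859-1") -> bool:
    """Rewrite one file as UTF-8. Returns True if the file was changed."""
    raw = path.read_bytes()
    try:
        # Pure ASCII and existing UTF-8 files decode cleanly: leave them alone.
        raw.decode("utf-8")
        return False
    except UnicodeDecodeError:
        pass
    # Assumption: anything that is not valid UTF-8 is in source_encoding.
    text = raw.decode(source_encoding)
    # Work on bytes so that line endings are preserved exactly.
    path.write_bytes(text.encode("utf-8"))
    return True


def convert_tree(root: Path, suffixes=(".txt",)) -> list[Path]:
    """Convert every file under root whose suffix is in suffixes."""
    converted = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix.lower() in suffixes:
            if convert_file_to_utf8(path):
                converted.append(path)
    return converted
```

Usage would be something like `convert_tree(Path("C:/work/myproject"), suffixes=(".cs", ".sql", ".txt"))` on a checked-out working folder (path and suffixes are placeholders), followed by a review of the diffs before checking in. One caveat: a Latin‑1 file whose bytes happen to form valid UTF‑8 would be skipped, so spot-checking the results is advisable.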
Thanks for the clarification — this answers my questions and gives me a path forward.
Re: How can we move our Vault 10.0.2 repository from ISO‑8859‑1 to UTF‑8?
Glad the information was helpful.
Tonya