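Structured Output
Purpose: convert the unstructured text returned by a large language model into the structured data an application needs.
A structured output converter does two core things:
- Defines a format: when the request is sent, the format instruction is appended to the prompt so that the model is guided to reply in that format
- Converts the string to the target format: once the model replies, the returned string is parsed into the specified format, for example a Java class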
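The two responsibilities above can be sketched without any framework. This is only a minimal illustration and not Spring AI's implementation: the `OutputConverter` interface and the `CommaListConverter` class are hypothetical names made up for this example, and a hard-coded string stands in for the model call.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

final class ConverterContractDemo {
    // Step 1: append the converter's format instruction to the user's prompt.
    static String buildPrompt(String userText, OutputConverter<?> converter) {
        return userText + System.lineSeparator() + converter.getFormat();
    }

    public static void main(String[] args) {
        OutputConverter<List<String>> converter = new CommaListConverter();
        System.out.println(buildPrompt("List three films starring a random actor.", converter));

        // Step 2: parse the reply. A hard-coded string stands in for the model call.
        String fakeReply = "Film A, Film B , Film C";
        System.out.println(converter.convert(fakeReply));
    }
}

// Hypothetical contract mirroring the two responsibilities described above.
interface OutputConverter<T> {
    String getFormat();          // instruction appended to the prompt
    T convert(String modelText); // parses the raw model reply
}

// Toy converter: asks for a comma-separated list and splits the reply on commas.
final class CommaListConverter implements OutputConverter<List<String>> {
    @Override
    public String getFormat() {
        return "Respond with a comma-separated list of values and nothing else.";
    }

    @Override
    public List<String> convert(String modelText) {
        return Arrays.stream(modelText.split(","))
                .map(String::trim)
                .filter(s -> !s.isEmpty())
                .collect(Collectors.toList());
    }
}
```

The converters shown in the rest of this article follow the same shape, except that the format instruction carries a JSON Schema and the parsing step is done by a JSON mapper.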
Converting to a Java Type
java
/**
* Returns the result as a Java object
*/
@RequestMapping("/5")
public ActorFilms execute5() {
return chatClient.prompt("Generate the filmography for a random actor.").call().entity(ActorFilms.class);
}
The Java class is as follows:
java
import lombok.Data;
import lombok.experimental.Accessors;
import java.util.List;
@Data
@Accessors(chain = true)
public class ActorFilms {
private String actor;
private List<String> films;
}
Under the hood, it works as follows:
java
@RequestMapping("/55")
public ActorFilms execute55() {
/*
* Create the converter (its format instruction is generated from the target class passed to BeanOutputConverter)
*/
BeanOutputConverter<ActorFilms> converter = new BeanOutputConverter<>(ActorFilms.class);
/*
* Fill the {format} placeholder with the converter's format instruction
*/
PromptTemplate promptTemplate = new PromptTemplate("""
Generate the filmography for a random actor.
{format}""");
Prompt prompt = promptTemplate.create(Map.of("format", converter.getFormat()));
/*
* Call the model
*/
String content = chatClient.prompt(prompt).advisors(new SimpleLoggerAdvisor()).call().content();
/*
* Convert the response to the target type
*/
return converter.convert(content);
}
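The converter generates a different format instruction depending on the Java type being returned. For example, for ActorFilms the format instruction is as follows (note the schema section):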
text
Your response should be in JSON format.
Do not include any explanations, only provide a RFC8259 compliant JSON response following this format without deviation.
Do not include markdown code blocks in your response.
Remove the ```json markdown from the output.
Here is the JSON Schema instance your output must adhere to:
```{
"$schema" : "https://json-schema.org/draft/2020-12/schema",
"type" : "object",
"properties" : {
"actor" : {
"type" : "string"
},
"films" : {
"type" : "array",
"items" : {
"type" : "string"
}
}
},
"additionalProperties" : false
}```
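To make the link between the target type and the schema more concrete, here is a toy sketch that derives a much-simplified schema from a class's fields using plain reflection. It is not how `BeanOutputConverter` does it (the real converter relies on a JSON Schema generator library and emits a full draft 2020-12 schema, including array item types); `ToySchemaDemo`, `jsonType`, and `describe` are hypothetical names for this illustration, and the nested `ActorFilms` is a stand-in without Lombok.

```java
import java.lang.reflect.Field;
import java.util.List;
import java.util.StringJoiner;

final class ToySchemaDemo {
    // Stand-in for the ActorFilms class shown earlier (Lombok annotations omitted).
    static final class ActorFilms {
        private String actor;
        private List<String> films;
    }

    // Maps a small set of Java types to JSON Schema type names.
    static String jsonType(Class<?> type) {
        if (type == String.class) return "string";
        if (type == boolean.class || type == Boolean.class) return "boolean";
        if (type == int.class || type == long.class || type == double.class
                || Number.class.isAssignableFrom(type)) return "number";
        if (type.isArray() || Iterable.class.isAssignableFrom(type)) return "array";
        return "object";
    }

    // Builds a simplified schema string: one property entry per declared field.
    static String describe(Class<?> target) {
        StringJoiner props = new StringJoiner(", ", "{", "}");
        for (Field field : target.getDeclaredFields()) {
            if (field.isSynthetic()) continue; // skip compiler-generated fields
            props.add("\"" + field.getName() + "\": {\"type\": \""
                    + jsonType(field.getType()) + "\"}");
        }
        return "{\"type\": \"object\", \"properties\": " + props + "}";
    }

    public static void main(String[] args) {
        System.out.println(describe(ActorFilms.class));
    }
}
```

Changing the fields of the target class changes the generated text, which is why each return type ends up with its own format instruction.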
Converting to a List Type
java
@RequestMapping("/6")
public List<ActorFilms> execute6() {
return chatClient.prompt("Generate the filmography of 5 movies for 周星驰 and 刘德华.").call().entity(new ParameterizedTypeReference<>() {
});
}
Under the hood:
java
@RequestMapping("/56")
public List<ActorFilms> execute56() {
/*
* Create the converter (its format instruction is generated from the generic type captured by the ParameterizedTypeReference)
*/
BeanOutputConverter<List<ActorFilms>> converter = new BeanOutputConverter<>(new ParameterizedTypeReference<>() {
});
/*
* Fill the {actor} and {format} placeholders
*/
PromptTemplate promptTemplate = new PromptTemplate("Tell me the names of 5 movies starring {actor}.\n{format}");
Prompt prompt = promptTemplate.create(Map.of("actor", "刘德华", "format", converter.getFormat()));
/*
* Call the model
*/
String content = chatClient.prompt(prompt).advisors(new SimpleLoggerAdvisor()).call().content();
/*
* Convert the response to the target type
*/
return converter.convert(content);
}
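In practice, the converters above do not succeed 100% of the time, and the outcome depends heavily on the model chosen. For example, reasoning models such as qwen3:32b and deepseek-r1 still return a thinking block even with the BeanOutputConverter format instruction in place. The response looks roughly like this: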
text
<think>
xxx
</think>
JSON result string
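In this case we need a custom converter that strips the thinking block.
If the returned payload is not valid JSON, a tool such as JSON-Repair can be used to repair the JSON string.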
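Before reaching for a repair library, a cheaper first step is sometimes enough: when the model wraps otherwise valid JSON in extra prose, the payload can be cut out by locating the outermost braces or brackets. The sketch below is not JSON-Repair and does not fix malformed JSON; `JsonExtractDemo` and `extractJson` are hypothetical names, and the approach assumes the reply contains a single JSON value.

```java
final class JsonExtractDemo {
    // Returns the span from the first '{' or '[' to the last matching kind of closer.
    // If no such span exists, the trimmed input is returned unchanged.
    static String extractJson(String reply) {
        int objStart = reply.indexOf('{');
        int arrStart = reply.indexOf('[');
        int start;
        char closer;
        if (objStart >= 0 && (arrStart < 0 || objStart < arrStart)) {
            start = objStart;   // the payload is a JSON object
            closer = '}';
        } else if (arrStart >= 0) {
            start = arrStart;   // the payload is a JSON array
            closer = ']';
        } else {
            return reply.trim();
        }
        int end = reply.lastIndexOf(closer);
        return end > start ? reply.substring(start, end + 1) : reply.trim();
    }

    public static void main(String[] args) {
        String reply = "Sure! Here is the result:\n"
                + "{\"actor\":\"A\",\"films\":[\"F\"]}\n"
                + "Hope this helps.";
        System.out.println(extractJson(reply));
    }
}
```

The extracted string still has to be parsed by the converter, so genuinely broken JSON (missing quotes, trailing commas) will continue to fail and is where a repair tool comes in.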
Custom Converter (Thinking-Mode Models)
java
import java.lang.reflect.Type;
import java.util.Objects;
import java.util.regex.Pattern;
import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.core.util.DefaultIndenter;
import com.fasterxml.jackson.core.util.DefaultPrettyPrinter;
import com.fasterxml.jackson.databind.DeserializationFeature;
import com.fasterxml.jackson.databind.JsonNode;
import com.fasterxml.jackson.databind.ObjectMapper;
import com.fasterxml.jackson.databind.ObjectWriter;
import com.fasterxml.jackson.databind.json.JsonMapper;
import com.github.victools.jsonschema.generator.Option;
import com.github.victools.jsonschema.generator.SchemaGenerator;
import com.github.victools.jsonschema.generator.SchemaGeneratorConfig;
import com.github.victools.jsonschema.generator.SchemaGeneratorConfigBuilder;
import com.github.victools.jsonschema.module.jackson.JacksonModule;
import com.github.victools.jsonschema.module.jackson.JacksonOption;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.ai.converter.StructuredOutputConverter;
import org.springframework.ai.model.KotlinModule;
import org.springframework.ai.util.JacksonUtils;
import org.springframework.core.KotlinDetector;
import org.springframework.core.ParameterizedTypeReference;
import org.springframework.lang.NonNull;
import static org.springframework.ai.util.LoggingMarkers.SENSITIVE_DATA_MARKER;
/**
* Output converter for thinking-mode models.
* Purpose: strips the <think>...</think> block from the model response before parsing.
*/
public class ThinkModelBeanOutputConverter<T> implements StructuredOutputConverter<T> {
private final Logger logger = LoggerFactory.getLogger(ThinkModelBeanOutputConverter.class);
private final Pattern pattern = Pattern.compile("<think>.*?</think>", Pattern.DOTALL);
private final Type type;
private final ObjectMapper objectMapper;
private String jsonSchema;
public ThinkModelBeanOutputConverter(Class<T> clazz) {
this(ParameterizedTypeReference.forType(clazz));
}
public ThinkModelBeanOutputConverter(ParameterizedTypeReference<T> typeRef) {
this(typeRef.getType(), null);
}
public ThinkModelBeanOutputConverter(Class<T> clazz, ObjectMapper objectMapper) {
this(ParameterizedTypeReference.forType(clazz), objectMapper);
}
public ThinkModelBeanOutputConverter(ParameterizedTypeReference<T> typeRef, ObjectMapper objectMapper) {
this(typeRef.getType(), objectMapper);
}
private ThinkModelBeanOutputConverter(Type type, ObjectMapper objectMapper) {
Objects.requireNonNull(type, "Type cannot be null;");
this.type = type;
this.objectMapper = objectMapper != null ? objectMapper : getObjectMapper();
generateSchema();
}
private void generateSchema() {
JacksonModule jacksonModule = new JacksonModule(JacksonOption.RESPECT_JSONPROPERTY_REQUIRED,
JacksonOption.RESPECT_JSONPROPERTY_ORDER);
SchemaGeneratorConfigBuilder configBuilder = new SchemaGeneratorConfigBuilder(
com.github.victools.jsonschema.generator.SchemaVersion.DRAFT_2020_12,
com.github.victools.jsonschema.generator.OptionPreset.PLAIN_JSON)
.with(jacksonModule)
.with(Option.FORBIDDEN_ADDITIONAL_PROPERTIES_BY_DEFAULT);
if (KotlinDetector.isKotlinReflectPresent()) {
configBuilder.with(new KotlinModule());
}
SchemaGeneratorConfig config = configBuilder.build();
SchemaGenerator generator = new SchemaGenerator(config);
JsonNode jsonNode = generator.generateSchema(this.type);
ObjectWriter objectWriter = this.objectMapper.writer(new DefaultPrettyPrinter()
.withObjectIndenter(new DefaultIndenter().withLinefeed(System.lineSeparator())));
try {
this.jsonSchema = objectWriter.writeValueAsString(jsonNode);
}
catch (JsonProcessingException e) {
logger.error("Could not pretty print json schema for jsonNode: {}", jsonNode);
throw new RuntimeException("Could not pretty print json schema for " + this.type, e);
}
}
@SuppressWarnings("unchecked")
@Override
public T convert(@NonNull String text) {
try {
// remove <think></think> block
text = cleanThinkTags(text).trim();
// Check for and remove triple backticks and "json" identifier
if (text.startsWith("```") && text.endsWith("```")) {
// Remove the first line if it contains "```json"
String[] lines = text.split("\n", 2);
if (lines[0].trim().equalsIgnoreCase("```json")) {
text = lines.length > 1 ? lines[1] : "";
}
else {
text = text.substring(3); // Remove leading ```
}
// Remove trailing ```
text = text.substring(0, text.length() - 3);
// Trim again to remove any potential whitespace
text = text.trim();
}
return (T) this.objectMapper.readValue(text, this.objectMapper.constructType(this.type));
}
catch (JsonProcessingException e) {
logger.error(SENSITIVE_DATA_MARKER,
"Could not parse the given text to the desired target type: \"{}\" into {}", text, this.type);
throw new RuntimeException(e);
}
}
private String cleanThinkTags(String response) {
return pattern.matcher(response).replaceAll("");
}
protected ObjectMapper getObjectMapper() {
return JsonMapper.builder()
.addModules(JacksonUtils.instantiateAvailableModules())
.configure(DeserializationFeature.FAIL_ON_UNKNOWN_PROPERTIES, false)
.build();
}
@Override
public String getFormat() {
String template = """
Your response should be in JSON format.
Do not include any explanations, only provide a RFC8259 compliant JSON response following this format without deviation.
Do not include markdown code blocks in your response.
Remove the ```json markdown from the output.
Here is the JSON Schema instance your output must adhere to:
```%s```
""";
return String.format(template, this.jsonSchema);
}
}
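Note: the converter uses a regular expression to match the <think>xxx</think> block and removes it from the response before the JSON is parsed.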
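The regular expression can be tried in isolation. The sketch below uses the same pattern as the converter; `ThinkTagDemo` and `stripThinkBlock` are hypothetical names for this example. It also shows why `Pattern.DOTALL` matters: without it, `.` does not match line breaks, so a multi-line thinking block is never matched.

```java
import java.util.regex.Pattern;

final class ThinkTagDemo {
    // DOTALL lets '.' match line breaks; the reluctant '.*?' stops at the first closing tag.
    private static final Pattern THINK_BLOCK =
            Pattern.compile("<think>.*?</think>", Pattern.DOTALL);

    // Same pattern without DOTALL, kept only to show the difference.
    private static final Pattern SINGLE_LINE_ONLY =
            Pattern.compile("<think>.*?</think>");

    static String stripThinkBlock(String reply) {
        return THINK_BLOCK.matcher(reply).replaceAll("").trim();
    }

    static boolean matchesWithoutDotall(String reply) {
        return SINGLE_LINE_ONLY.matcher(reply).find();
    }

    public static void main(String[] args) {
        String reply = "<think>\nThe user wants JSON.\n</think>\n{\"actor\":\"A\",\"films\":[]}";
        System.out.println(stripThinkBlock(reply));      // prints {"actor":"A","films":[]}
        System.out.println(matchesWithoutDotall(reply)); // prints false
    }
}
```

One limitation to keep in mind: the pattern requires a closing `</think>` tag, so a truncated response with an unclosed thinking block is left untouched. As for wiring the converter in, Spring AI's `ChatClient` call spec appears to offer an `entity(...)` overload that accepts a `StructuredOutputConverter`, so an instance of `ThinkModelBeanOutputConverter` could presumably be passed there directly; treat that as an assumption and check it against the version in use.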
