Learning quantum properties from short-range correlations using multi-task networks


ABSTRACT

Characterizing multipartite quantum systems is crucial for quantum computing and many-body physics. The problem, however, becomes challenging when the system size is large and the


properties of interest involve correlations among a large number of particles. Here we introduce a neural network model that can predict various quantum properties of many-body quantum


states with constant correlation length, using only measurement data from a small number of neighboring sites. The model is based on the technique of multi-task learning, which we show to


offer several advantages over traditional single-task approaches. Through numerical experiments, we show that multi-task learning can be applied to sufficiently regular states to predict


global properties, like string order parameters, from the observation of short-range correlations, and to distinguish between quantum phases that cannot be distinguished by single-task


networks. Remarkably, our model appears to be able to transfer information learnt from lower dimensional quantum systems to higher dimensional ones, and to make accurate predictions for


Hamiltonians that were not seen in the training.

INTRODUCTION

The experimental characterization of many-body quantum states is an essential task in quantum information and computation. Neural networks provide a powerful approach to


quantum state characterization1,2,3,4, enabling a compact representation of sufficiently structured quantum states5. In recent years, different types of neural networks have been


successfully utilized to predict properties of quantum systems, including quantum fidelity6,7,8 and other measures of similarity9,10, quantum entanglement11,12,13, entanglement


entropy1,14,15, two-point correlations1,2,14 and Pauli expectation values4,16, as well as to identify phases of matter17,18,19,20,21. A challenge in characterizing multiparticle quantum


systems is that the number of measurement settings rapidly increases with the system size. Randomized measurement techniques22,23,24,25,26,27,28,29,30 provide an effective way to predict the


properties of generic quantum states by using a reduced number of measurement settings, randomly sampled from the set of products of single particle observables. In the special case of


many-body quantum systems subject to local interactions, however, sampling from an even smaller set of measurements may be possible, due to the additional structure of the states under


consideration, which may enable a characterization of the state based only on short-range correlations, that is, correlations involving only a small number of neighboring sites. The use of


short-range correlations has been investigated for the purpose of quantum state tomography31,32,33,34,35 and entanglement detection36,37. A promising approach is to employ neural networks to


predict global quantum properties directly from data obtained by sampling over a set of short-range correlations. In this paper, we develop a neural network model for predicting various


properties of quantum many-body states from short-range correlations. Our model utilizes the technique of multi-task learning38 to generate concise state representations that integrate


diverse types of information. In particular, the model can integrate information obtained from few-body measurements into a representation of the overall quantum state, in a way that is


reminiscent of the quantum marginal problem39,40,41. The state representations produced by our model are then used to learn new physical properties that were not seen during the training,


including global properties such as string order parameters and many-body topological invariants42. For ground states with short-range correlations, we find that our model accurately


predicts nonlocal features using only measurements on a few nearby particles. With respect to traditional, single-task neural networks, our model achieves more precise predictions with


comparable amounts of input data and enables a direct, unsupervised classification of symmetry-protected topological (SPT) phases that could not be distinguished in the single-task approach.


In addition, we find that, after the training is completed, the model can be applied to quantum states and Hamiltonians outside the original training set, and even to quantum systems of


higher dimension. This strong performance on out-of-distribution states suggests that the multi-task approach is a promising tool for exploring the next frontier of intermediate-scale


quantum systems.

RESULTS

MULTI-TASK FRAMEWORK FOR QUANTUM PROPERTIES

Consider the scenario where an experimenter has access to multiple copies of an unknown quantum state _ρ__θ_,


characterized by some physical parameters _θ_. For example, _ρ__θ_ could be a ground state of a many-body local Hamiltonian depending on _θ_. The experimenter’s goal is to predict a set of


properties of the quantum state, such as the expectation values of some observables, or some nonlinear functions, such as the von Neumann entropy. The experimenter is able to perform a


restricted set of quantum measurements, denoted by \({{\mathcal{M}}}\). Each measurement \({{\bf{M}}}\in {{\mathcal{M}}}\) is described by a positive operator-valued measure (POVM) M = 


(_M__j_), where the index _j_ labels the measurement outcome, each _M__j_ is a positive operator acting on the system’s Hilbert space, and the normalization condition ∑_j__M__j_  = _I_ is


satisfied. In general, the measurement set \({{\mathcal{M}}}\) may not be informationally complete. For multipartite systems, we will typically take \({{\mathcal{M}}}\) to consist of


short-range measurements, that is, local measurements performed on a small number of neighboring systems, although this choice is not a necessary part of our multi-task learning framework.
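
As a concrete illustration of the quantities involved, the outcome distribution \({d}_{ij}={{\rm{tr}}}(\rho {M}_{ij})\) of a three-qubit Pauli measurement on neighboring sites can be computed directly from its definition. The following numpy sketch (function and variable names are our own, for illustration only) does so for a pure state:

```python
import numpy as np
from functools import reduce

PAULI = {"x": np.array([[0, 1], [1, 0]], dtype=complex),
         "y": np.array([[0, -1j], [1j, 0]]),
         "z": np.array([[1, 0], [0, -1]], dtype=complex)}
EYE = np.eye(2, dtype=complex)

def pauli_outcome_distribution(psi, site, bases, n_qubits):
    """Outcome distribution d_ij of the product Pauli measurement acting on
    qubits (site, site+1, site+2) of the pure state psi. Each POVM element
    M_j is a product of rank-1 Pauli eigenprojectors."""
    probs = []
    for outcomes in np.ndindex(2, 2, 2):           # 2^3 = 8 outcomes
        ops = [EYE] * n_qubits
        for k, (basis, o) in enumerate(zip(bases, outcomes)):
            _, evecs = np.linalg.eigh(PAULI[basis])
            v = evecs[:, o]                        # eigenvector of outcome o
            ops[site + k] = np.outer(v, v.conj())  # rank-1 projector
        M = reduce(np.kron, ops)                   # M_j on the full space
        probs.append(np.real(psi.conj() @ M @ psi))
    return np.array(probs)

# example: GHZ state of 4 qubits, measure x⊗y⊗z on qubits 1-3
psi = np.zeros(2**4, dtype=complex); psi[0] = psi[-1] = 1 / np.sqrt(2)
d = pauli_outcome_distribution(psi, site=1, bases="xyz", n_qubits=4)
assert np.isclose(d.sum(), 1.0)
```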


It is also worth noting that choosing short-range measurements for the set \({{\mathcal{M}}}\) does not necessarily mean that the experimenter has to physically isolate a subset of


neighboring systems before performing their measurements. Access to short-range measurement statistics can be obtained, e.g., from product measurements performed jointly on all systems, by


discarding the outcomes generated from systems outside the subset of interest. In this way, a single product measurement performed jointly on all systems can provide data to multiple


short-range measurements. To collect data, the experimenter randomly picks a subset of measurements \({{\mathcal{S}}}\subset {{\mathcal{M}}}\), and performs them on different copies of the


state _ρ__θ_. We will denote by _s_ the number of measurements in \({{\mathcal{S}}}\), and by \({{{\bf{M}}}}_{i}:\!\!=\left({M}_{ij}\right)\) the _i_-th POVM in \({{\mathcal{S}}}\). For


simplicity, if not specified otherwise, we assume that each measurement in \({{\mathcal{S}}}\) is repeated sufficiently many times, so that the experimenter can reliably estimate the outcome


distribution \({{{\bf{d}}}}_{i}:\!\!=({d}_{ij})\), where \({d}_{ij}:\!\!={{\rm{tr}}}(\rho {M}_{ij})\). The experimenter’s goal is to predict multiple quantum properties of _ρ__θ_ using the


outcome distributions \({({{{\bf{d}}}}_{i})}_{i=1}^{s}\). This task is achieved by a neural network that consists of an encoder and multiple decoders, where the encoder \({{\mathcal{E}}}\)


produces a representation of quantum states and the _k_-th decoder \({{{\mathcal{D}}}}_{k}\) produces a prediction of the _k_-th property of interest. Due to their roles, the encoder and


decoders are also known as representation and prediction networks, respectively. The input of the representation network \({{\mathcal{E}}}\) is the outcome distribution \({{{\bf{d}}}}_{i}\), together with a


parametrization of the corresponding measurement \({{{\bf{M}}}}_{i}\), hereafter denoted by \({{{\bf{m}}}}_{i}\). From the pair of data \(({{{\bf{d}}}}_{i},{{{\bf{m}}}}_{i})\), the network produces a state representation


\({{{\bf{r}}}}_{i}:\!\!={{\mathcal{E}}}({{{\bf{d}}}}_{i},{{{\bf{m}}}}_{i})\). To combine the state representations arising from different measurements in \({{\mathcal{S}}}\), the network


computes the average \({{\bf{r}}}:\!\!=\frac{1}{s}{\sum }_{i=1}^{s}{{{\bf{r}}}}_{i}\). At this point, the vector \({{\bf{r}}}\) can be viewed as a representation of the unknown quantum state _ρ_. Each


prediction network \({{{\mathcal{D}}}}_{k}\) is dedicated to a different property of the quantum state. In the case of multipartite quantum systems, we include the option of evaluating the


property on a subsystem, specified by a parameter _q_. We denote by _f__k_,_q_(_ρ__θ_) the correct value of the _k_-th property of subsystem _q_ when the total system is in the state _ρ__θ_.


Upon receiving the state representation \({{\bf{r}}}\) and the subsystem specification _q_, the prediction network produces an estimate \({{{\mathcal{D}}}}_{k}({{\bf{r}}},q)\) of the value


_f__k_,_q_(_ρ_). The representation network and all the prediction networks are trained jointly, with the goal of minimizing the total prediction error on a set of fiducial states. The


fiducial states are chosen by randomly sampling a set of physical parameters \({({\theta }_{l})}_{l=1}^{L}\). For each fiducial state \({\rho }_{{\theta }_{l}}\), we independently


sample a set of measurements \({{{\mathcal{S}}}}_{l}\) and calculate the outcome distributions for each measurement in the set \({{{\mathcal{S}}}}_{l}\). We randomly choose a subset of


properties \({{{\mathcal{K}}}}_{l}\) for each \({\rho }_{{\theta }_{l}}\), where each property \(k\in {{{\mathcal{K}}}}_{l}\) corresponds to a set of subsystems


\({{{\mathcal{Q}}}}_{k}\), and then calculate the correct values of the quantum properties \(\{{f}_{k,q}({\rho }_{{\theta }_{l}})\}\) for all properties \(k\in {{{\mathcal{K}}}}_{l}\)


associated with subsystems \(q\in {{{\mathcal{Q}}}}_{k}\). The training data may be either classically simulated or gathered by actual measurements on the set of fiducial states, or it could


also be obtained by any combination of these two approaches. During the training, we do not provide the model with any information about the physical parameters _θ__l_ or about the


functions _f__k_,_q_. Instead, the internal parameters of the neural networks are jointly optimized in order to minimize the prediction errors \(\left| {{{\mathcal{D}}}}_{k}\left(\frac{1}{s}\mathop{\sum }\nolimits_{i=1}^{s}{{\mathcal{E}}}({{{\bf{d}}}}_{i},{{{\bf{m}}}}_{i}),q\right)-{f}_{k,q}({\rho }_{\theta })\right|\), summed over all the fiducial states, all chosen properties, and all chosen subsystems.
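
In a PyTorch-style sketch (the names `encoder`, `decoders`, and the batch layout below are our own illustrative choices, not the paper's code), this joint objective could be written as:

```python
import torch

def multitask_loss(encoder, decoders, batch):
    """Total prediction error over a batch of fiducial states. Each entry
    carries the measured pairs (d_i, m_i) and its labeled properties."""
    loss = torch.tensor(0.0)
    for state in batch:
        # encode each (outcome distribution, measurement) pair, then average
        reps = [encoder(d, m) for d, m in state["measurements"]]
        r = torch.stack(reps).mean(dim=0)        # state representation r
        for k, q, y in state["labels"]:          # property k on subsystem q
            loss = loss + torch.abs(decoders[k](r, q) - y)
    return loss
```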


After the training is concluded, our model can be used for predicting quantum properties, either within the set of properties seen during training or outside this set. The


requested properties can be predicted for a new, unknown state _ρ__θ_, or even for an out-of-distribution state _ρ_ that has a structural similarity with the states in the original distribution, e.g.,


a ground state of the same type of Hamiltonian, but for a quantum system with a larger number of particles. The high-level structure of our model is illustrated in Fig. 1, while the details


of the neural networks are presented in Methods.

LEARNING GROUND STATES OF CLUSTER-ISING MODEL

We first test the performance of our model on a relatively small system of _N_ = 9 qubits


whose properties can be explicitly calculated. For the state family, we take the ground states of the one-dimensional cluster-Ising model43 $${H}_{{{\rm{cI}}}}=-\mathop{\sum }_{i=1}^{N-2}{\sigma


}_{i}^{z}{\sigma }_{i+1}^{x}{\sigma }_{i+2}^{z}-{h}_{1}{\sum }_{i=1}^{N}{\sigma }_{i}^{x}-{h}_{2}{\sum }_{i=1}^{N-1}{\sigma }_{i}^{x}{\sigma }_{i+1}^{x}.$$ (1) The ground state falls in one


of three phases, depending on the values of the parameters (_h_1, _h_2). The three phases are: the SPT phase, the paramagnetic phase, and the antiferromagnetic phase. The SPT phase can be


distinguished from the other two phases by measuring the string order parameter44,45\(\langle \tilde{S}\rangle :\!\!=\langle {\sigma }_{1}^{z}{\sigma }_{2}^{x}{\sigma }_{4}^{x}\ldots {\sigma


}_{N-3}^{x}{\sigma }_{N-1}^{x}{\sigma }_{N}^{z}\rangle \), which is a global property involving (_N_ + 3)/2 qubits.
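
For _N_ = 9, both the ground state and the string order parameter can be checked by exact diagonalization. The following numpy sketch (our own illustration, not the code used in the paper) builds \({H}_{{{\rm{cI}}}}\) as a dense matrix and evaluates \(\langle \tilde{S}\rangle \) on the ground state:

```python
import numpy as np
from functools import reduce

X = np.array([[0, 1], [1, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
I2 = np.eye(2, dtype=complex)

def op_chain(ops_at, n):
    """Tensor product over n qubits with the given single-site operators."""
    ops = [I2] * n
    for site, op in ops_at.items():
        ops[site] = op
    return reduce(np.kron, ops)

def cluster_ising(n, h1, h2):
    H = sum(-op_chain({i: Z, i + 1: X, i + 2: Z}, n) for i in range(n - 2))
    H = H - h1 * sum(op_chain({i: X}, n) for i in range(n))
    H = H - h2 * sum(op_chain({i: X, i + 1: X}, n) for i in range(n - 1))
    return H

n, h1, h2 = 9, 0.3, 0.2                      # a point deep in the SPT phase
_, vecs = np.linalg.eigh(cluster_ising(n, h1, h2))
gs = vecs[:, 0]                              # ground state

# string order parameter <σ1^z σ2^x σ4^x ... σ_{N-3}^x σ_{N-1}^x σ_N^z>
string = op_chain({0: Z, 1: X, **{i: X for i in range(3, n - 2, 2)},
                   n - 2: X, n - 1: Z}, n)
print(np.real(gs.conj() @ string @ gs))
```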


We test our network model on the ground states corresponding to a 64 × 64 square grid in the parameter region (_h_1, _h_2) ∈ [0, 1.6] × [−1.6, 1.6]. For the set of accessible measurements \({{\mathcal{M}}}\), we take all possible three-nearest-neighbor Pauli


measurements, corresponding to the observables \({\sigma }_{i}^{\alpha }{\sigma }_{i+1}^{\beta }{\sigma }_{i+2}^{\gamma }\), where _i_ ∈ {1, 2, … , _N_ − 2} and _α_, _β_, _γ_ ∈ {_x_, _y_, 


_z_}. It is worth noting that, when two measurements \({{\bf{M}}}\,\ne\, {{{\bf{M}}}}^{{\prime} }\) act on disjoint qubit triplets or coincide at overlapping qubits, these measurements can


be performed simultaneously on a single copy of the state, thereby reducing the number of data collection rounds. In general, increasing the range of the correlations among Pauli


measurements can increase the performance of the network. For example, using the correlations from Pauli measurements on triplets of neighboring qubits (as described above) leads to a better


performance than using correlations from Pauli measurements on pairs of neighboring qubits, as illustrated in Supplementary Note 5. On the other hand, increasing the range of the


correlations also increases the size of the input to the neural network, making the training more computationally expensive. For the prediction tasks, we consider two properties: (A1) the


two-point correlation function \({{{\mathcal{C}}}}_{1j}^{\alpha }:\!\!={\langle {\sigma }_{1}^{\alpha }{\sigma }_{j}^{\alpha }\rangle }_{\rho }\), where 1 < _j_ ≤ _N_ and _α_ = _x_, _z_;


(A2) the Rényi entanglement entropy of order two \({S}_{A}:\!\!=-{\log }_{2}\left({{\rm{tr}}}{\rho }_{A}^{2}\right)\) for subsystem _A_ = [1, 2, … , _i_], where 1 ≤ _i_ < _N_. Both


properties (A1) and (A2) can be either numerically evaluated or experimentally estimated by preparing the appropriate quantum state and performing randomized measurements27.
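
For a pure state vector, property (A2) reduces to a reshape and a matrix product; a minimal sketch (ours, for illustration) is:

```python
import numpy as np

def renyi2_entropy(psi, cut, n):
    """Order-2 Rényi entropy of subsystem A = [1, ..., cut] (the first
    `cut` qubits) for an n-qubit pure state vector psi."""
    m = psi.reshape(2**cut, 2**(n - cut))   # bipartition A | complement
    rho_a = m @ m.conj().T                  # reduced state tr_B |psi><psi|
    purity = np.real(np.trace(rho_a @ rho_a))
    return -np.log2(purity)

# example: for the 9-qubit ground state `gs` from the previous sketch,
# renyi2_entropy(gs, cut=4, n=9) gives S_A for A = [1, 2, 3, 4]
```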


We train our neural network with respect to the fiducial ground states corresponding to 300 randomly chosen points from our 4096-element grid. For each fiducial state, we provide the neural network with


the outcome distributions of _s_ = 50 measurements, randomly chosen from the 243 measurements in \({{\mathcal{M}}}\). Half of the fiducial states, randomly chosen from the whole set, are


labeled by the values of property (A1) and the other half are labeled by property (A2). After training is concluded, we apply our trained model to predicting properties (A1)-(A2) for all


remaining ground states corresponding to points on the grid. For each test state, the representation network is provided with the outcome distributions on _s_ = 50 measurement settings


randomly chosen from \({{\mathcal{M}}}\). Figure 2a illustrates the coefficient of determination (_R_2), averaged over all test states, for each type of property. Notably, all the values of


_R_2 observed in our experiments are above 0.95. Our network makes accurate predictions even near the boundary between the SPT phase and paramagnetic phase, in spite of the fact that phase


transitions typically make it more difficult to capture the ground state properties from limited measurement data. For a ground state close to the boundary, marked by a star in the phase


diagram (Fig. 3d), the predictions of the entanglement entropy \({{{\mathcal{S}}}}_{A}\) and spin correlation \({{{\mathcal{C}}}}_{1j}^{z}\) are close to the corresponding ground truths, as


shown in Fig. 2d and e, respectively. In general, the accuracy of the predictions depends on the number of samplings for each measurement as well as the number of measurement settings. For


our experiments, the dependence is illustrated in Fig. 2b and c. To examine whether our multi-task neural network model enhances the prediction accuracy compared to single-task networks, we


perform ablation experiments46. We train three individual single-task neural networks as our baseline models, which predict spin correlations along the Pauli-x axis, spin correlations along the


Pauli-z axis, and entanglement entropies, respectively. For each single-task neural network, the training provides the network with the corresponding properties for the 300 fiducial ground


states, without providing any information about the other properties. After the training is concluded, we apply each single-task neural network to predict the corresponding properties on all


the test states and use their predictions as baselines to benchmark the performance of our multi-task neural network. Figure 2a compares the values of _R_2 for the predictions of our


multi-task neural model with those of the single-task counterparts. The results demonstrate that learning multiple physical properties simultaneously enhances the prediction of each


individual property.

TRANSFER LEARNING TO NEW TASKS

We now show that the state representations produced by the encoder can be used to perform new tasks that were not encountered during the


training phase. In particular, we show that state representations can be used to distinguish between the phases of matter associated to different values of the Hamiltonian parameters in an


unsupervised manner. To this purpose, we project the representations of all the test states onto a two-dimensional (2D) plane using the t-distributed stochastic neighbor embedding (t-SNE)


algorithm. The results are shown in Fig. 3a.
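
A projection of this kind can be produced with any off-the-shelf t-SNE implementation; a minimal sketch using scikit-learn (the array `reps`, holding one representation vector per test state, and the file name are our own placeholders) is:

```python
import numpy as np
from sklearn.manifold import TSNE

# reps: array of shape (n_states, rep_dim), one row per state representation
reps = np.load("state_representations.npy")   # hypothetical file
xy = TSNE(n_components=2, perplexity=30).fit_transform(reps)
# xy[:, 0], xy[:, 1] give the 2D coordinates used in the scatter plots
```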


Every data point shows the exact value of the string order parameter, which distinguishes between the SPT phase and the other two phases. Quite strikingly, we find that the disposition of the points in the 2D representation matches the values of the string order parameter, even though no information about the string order parameters


was provided during the training, and even though the string order is a global property, while the measurement data provided to the network came from a small number of neighboring sites. A


natural question is whether the accurate classification of phases of matter observed above is a consequence of the multi-task nature of our model. To shed light on this question, we


compare the results of our multi-task network with those of single-task neural networks, feeding the state representations generated by these networks into the t-SNE algorithm to produce a


2D representation. The pattern of the projected state representations in Fig. 3b indicates that when trained only with the values of entanglement entropies, the neural network cannot


distinguish between the paramagnetic phase and the antiferromagnetic phase. Interestingly, a single-task network trained only on the spin correlations can still distinguish the SPT phase


from the other two phases, as shown in Fig. 3c. However, in the next section we see that applying random local gates induces errors in the single-task network, while the multi-task network


still achieves a correct classification of the different phases. Quantitatively, the values of the string order parameter can be extracted from the state representations using another neural


network \({{\mathcal{N}}}\). To train this network, we randomly pick 100 reference states {_σ__i_} out of the 300 fiducial states and minimize the error \({\sum }_{i=1}^{100}|


{{\mathcal{N}}}({{{\bf{r}}}}_{{\sigma }_{i}})-{\langle \tilde{S}\rangle }_{{\sigma }_{i}}| \). Then, we use the trained neural network \({{\mathcal{N}}}\) to produce the prediction


\({{\mathcal{N}}}({{{\bf{r}}}}_{\rho })\) of \({\langle \tilde{S}\rangle }_{\rho }\) for every other state _ρ_. The prediction for each ground state is shown in the phase diagram (Fig. 3d),


where the 100 reference states are marked by white circles. The predictions are close to the true values of string order parameters, with the coefficient of determination between the


predictions and the ground truth being 0.97. It is important to stress that, while the network \({{\mathcal{N}}}\) was trained on values of the string order parameter, the representation


network \({{\mathcal{E}}}\) was not provided any information about this parameter. Note also that the values of the Hamiltonian parameters (_h_1, _h_2) are just provided in the figure for


the purpose of visualization: in fact, no information about the Hamiltonian parameters was provided to the network during training or test. In Supplementary Note 5, we show that our neural


network model trained for predicting entanglement entropy and spin correlations can also be transferred to other ground-state properties of the cluster-Ising model.

GENERALIZATION TO OUT-OF-DISTRIBUTION STATES

In the previous sections, we assumed that both the training and the testing states were randomly sampled from a set of ground states of the cluster-Ising model


(1). In this subsection, we explore how a model trained on a given set of quantum states can generalize to states outside the original set in an unsupervised or weakly supervised manner. Our


first finding is that our model, trained on the ground states of the cluster-Ising model, can effectively cluster general quantum states in the SPT phase and the trivial phase (respecting


the symmetry of bit flips at even/odd sites), without further training. Random quantum states in the SPT (trivial) phase can be prepared by applying short-range symmetry-respecting local random


quantum gates on a cluster state in the SPT phase (a product state \({\left\vert+\right\rangle }^{\otimes N}\) in the paramagnetic phase). For these random quantum states, we follow the same


measurement strategy adopted before, feed the measurement data into our trained representation network, and use t-SNE to project the state representations onto a 2D plane. When the quantum


circuit consists of a layer of translation-invariant next-nearest neighbor symmetry-respecting random gates, our model successfully classifies the output states into their respective SPT


phase and trivial phase in both cases, as shown by Fig. 4a. In contrast, feeding the same measurement data into the representation network trained only on spin correlations fails to produce


two distinct clusters via t-SNE, as shown by Fig. 4b. While this neural network successfully classifies different phases for the cluster-Ising model, random local quantum gates confuse it.


This failure is consistent with the recent observation that extracting linear functions of a quantum state is insufficient for classifying arbitrary states within the SPT phase and the trivial


phase26. We then prepare more complex states by applying two layers of translation-invariant random gates consisting of both nearest neighbor and next-nearest neighbor gates preserving the


symmetry onto the initial states. The results in Fig. 4c show that the state representations of these two phases remain different, but the boundary between them in the representation space


is less clearly identified. In contrast, the neural network trained only on spin correlations fails to classify these two phases, as shown by Fig. 4d. Finally, we demonstrate that our neural


model, trained on the cluster-Ising model, can adapt to learn the ground states of a new, perturbed Hamiltonian47 $${H}_{{{\rm{pcI}}}}={H}_{{{\rm{cI}}}}+{h}_{3}\mathop{\sum


}_{i=1}^{N-1}{\sigma }_{i}^{z}{\sigma }_{i+1}^{z}.$$ (2) This perturbation breaks the original symmetry, shifts the boundary of the cluster phase, and introduces a new phase of matter. In


spite of these substantial changes, Fig. 5a shows that our model, trained on the unperturbed cluster-Ising model, successfully identifies the different phases, including the new phase from


the perturbation. Moreover, using just 10 randomly chosen additional reference states (marked by white circles in Fig. 5b), the original prediction network can be adjusted to predict the


values of \(\langle \tilde{S}\rangle \) from state representations. As shown in Fig. 5b, the predicted values closely match the ground truths in Fig. 5c, achieving a


coefficient of determination of up to 0.956 between the predictions and the ground truths.

LEARNING GROUND STATES OF XXZ MODEL

We now apply our model to a larger quantum system, consisting


of 50 qubits in ground states of the bond-alternating XXZ model24 $$H=\, J{\sum }_{i=1}^{N/2}\left({\sigma }_{2i-1}^{x}{\sigma }_{2i}^{x}+{\sigma }_{2i-1}^{y}{\sigma }_{2i}^{y}+\delta


{\sigma }_{2i-1}^{z}{\sigma }_{2i}^{z}\right)\\ +\,{J}^{{\prime} }{\sum }_{i=1}^{N/2-1}\left({\sigma }_{2i}^{x}{\sigma }_{2i+1}^{x}+{\sigma }_{2i}^{y}{\sigma }_{2i+1}^{y}+\delta {\sigma


}_{2i}^{z}{\sigma }_{2i+1}^{z}\right),$$ (3) where _J_ and \({J}^{{\prime} }\) are the alternating values of the nearest-neighbor spin couplings. We consider a set of ground states


corresponding to a 21 × 21 square grid in the parameter region \((J/{J}^{{\prime} },\delta )\in (0,\,3)\times (0,\,4)\). Depending on the ratio of \(J/{J}^{{\prime} }\) and the strength of


_δ_, the corresponding ground state falls into one of three possible phases: the trivial SPT phase, the topological SPT phase, and the symmetry-broken phase. Unlike the SPT phases of the cluster-Ising


model, the SPT phases of the bond-alternating XXZ model cannot be detected by any string order parameter. Both SPT phases are protected by bond-center inversion symmetry, and detecting them


requires a many-body topological invariant, called the partial reflection topological invariant24 and denoted by $${{{\mathcal{Z}}}}_{{{\rm{R}}}}:\!\!=\frac{{{\rm{tr}}}({\rho


}_{I}{{{\mathcal{R}}}}_{I})}{\sqrt{\left[{{\rm{tr}}}\left({\rho }_{{I}_{1}}^{2}\right)+{{\rm{tr}}}\left({\rho }_{{I}_{2}}^{2}\right)\right]/2}}.$$ (4) Here, \({{{\mathcal{R}}}}_{I}\) is the


swap operation on subsystem _I_ := _I_1 ∪ _I_2 with respect to the center of the spin chain, and _I_1 = [_N_/2 − 5, _N_/2 − 4, … , _N_/2] and _I_2 = [_N_/2 + 1, _N_/2 + 2, … , _N_/2 + 6] are


two subsystems with six qubits.
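
For system sizes small enough to store the full state vector, Eq. (4) can be evaluated directly; the sketch below (function names and the 12-qubit usage example are our own illustration, whereas the paper computes this quantity on DMRG ground states) builds the reflection as a permutation of basis states:

```python
import numpy as np

def reduced_state(psi, keep, n):
    """Reduced density matrix of the qubits listed in `keep` (sorted)
    for an n-qubit pure state vector psi."""
    t = psi.reshape([2] * n)
    drop = [q for q in range(n) if q not in keep]
    t = np.moveaxis(t, keep + drop, list(range(n)))   # kept qubits first
    m = t.reshape(2 ** len(keep), 2 ** len(drop))
    return m @ m.conj().T

def partial_reflection_invariant(psi, I1, I2, n):
    """Z_R of Eq. (4): the reflection acts within I = I1 ∪ I2,
    swapping qubit I[k] with qubit I[-1-k]."""
    I = I1 + I2
    rho_I = reduced_state(psi, I, n)
    d = len(I)
    # reflection as a basis permutation: reverse the order of the d bits
    perm = np.zeros((2 ** d, 2 ** d))
    for b in range(2 ** d):
        bits = format(b, f"0{d}b")
        perm[int(bits[::-1], 2), b] = 1.0
    numer = np.trace(rho_I @ perm).real
    rho1, rho2 = reduced_state(psi, I1, n), reduced_state(psi, I2, n)
    denom = np.sqrt((np.trace(rho1 @ rho1).real +
                     np.trace(rho2 @ rho2).real) / 2)
    return numer / denom

# usage on any 12-qubit state vector psi, mirroring the setup above:
# z_r = partial_reflection_invariant(psi, I1=[3, 4, 5], I2=[6, 7, 8], n=12)
```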


For the set of possible measurements \({{\mathcal{M}}}\), we take all possible three-nearest-neighbor Pauli projective measurements, as we did earlier in the cluster-Ising model. For the prediction tasks, we consider two types of quantum properties: (B1) nearest-neighbor spin correlations \({{{\mathcal{C}}}}_{i,i+1}^{\beta }:\!\!=\langle {\sigma


}_{i}^{\beta }{\sigma }_{i+1}^{\beta }\rangle (1\le i\le N-1)\), where _β_ =  _x_, _z_; (B2) order-two Rényi mutual information _I__A_:_B_, where _A_ and _B_ are both 4-qubit subsystems:


either _A_1 = [22: 25], _B_1  =  [26: 29] or _A_2 = [21: 24], _B_2  =  [25: 28]. We train our neural network with respect to the fiducial ground states corresponding to 80 pairs of


\((J/{J}^{{\prime} },\delta )\), randomly sampled from the 441-element grid. For each fiducial state, we provide the neural network with the probability distributions corresponding to _s_ = 


200 measurements, randomly chosen from the 1350 measurements in \({{\mathcal{M}}}\). Half of the fiducial states, randomly chosen from the entire set, are labeled by property (B1), while


the other half are labeled by property (B2). After the training is concluded, we use our trained model to predict both properties (B1) and (B2) for all the ground states in the grid.


Figure 6a demonstrates the strong predictive performance of our model, where the values of _R_2, averaged over test states, are above 0.92 for all properties. We benchmark the performance of our multi-task neural network against the predictions of single-task counterparts. Here each single-task neural network, whose size is the same as that of the multi-task network, aims at predicting


one single physical property and is trained using the same set of measurement data of 80 fiducial states together with one of their properties: \({C}_{i,i+1}^{x}\), \({C}_{i,i+1}^{z}\),


\({I}_{{A}_{1}:{B}_{1}}\) and \({I}_{{A}_{2}:{B}_{2}}\). Figure 6a compares the coefficients of determination for the predictions of both our multi-task neural network and the single-task


neural networks, where each experiment is repeated multiple times over different sets of _s_ = 200 measurements randomly chosen from \({{\mathcal{M}}}\). The results indicate that our


multi-task neural model not only achieves higher accuracy in the predictions of all properties, but is also much more robust to different choices of quantum measurements. As in the case of


the cluster-Ising model, we also study how the number of quantum measurements _s_ and the number of samplings for each quantum measurement affect the prediction accuracy of our neural


network model, as shown by Fig. 6b and c. Additionally, in Supplementary Note 6 we test how the size of the quantum system affects the prediction accuracy given the same amount of local


measurement data, and how the number of layers in the representation network affects the prediction accuracy. Interestingly, we observe that reducing the number of layers from four to two


results in a significant decline in performance when predicting properties (B1) and (B2). This observation indicates that the depth of the network plays an important role in achieving


effective multi-task learning of the bond-alternating XXZ ground states. We show that, even in the larger-scale example considered in this section, the state representations obtained through


multi-task training contain information about the quantum phases of matter. In Fig. 7a, we show the 2D-projection of the state representations. The data points corresponding to ground


states in the topological SPT phase, the trivial SPT phase and the symmetry-broken phase appear to be clearly separated into three clusters, with the latter two connected by a few data


points corresponding to ground states across the phase boundary. A few points, corresponding to ground states near the phase boundaries of the topological SPT phase, are incorrectly clustered by


the t-SNE algorithm. The origin of the problem is that the correlation length of ground states near a phase boundary becomes longer, and therefore the measurement statistics on triplets of neighboring qubits cannot capture sufficient information for predicting the correct phase of matter. We further examine whether the single-task neural networks above can


correctly classify the three different phases of matter. We project the state representations produced by each single-task neural network onto 2D planes by the t-SNE algorithm, as shown by


Fig. 7b and c. The pattern of projected representations in Fig. 7b implies that when trained only with the values of spin correlations, the neural network cannot distinguish the topological


SPT phase from the trivial SPT phase. The pattern in Fig. 7c indicates that when trained solely with mutual information, the clustering performance is slightly improved, but the network still cannot


explicitly classify these two SPT phases. We also project the state representations produced by the neural network for predicting measurement outcome statistics3 onto a 2D plane. The


resulting pattern, shown in Fig. 7d, shows that the topological SPT phase and the trivial SPT phase cannot be correctly classified either. These observations indicate that a multi-task


approach, including both the properties of mutual information and spin correlations, is necessary to capture the difference between the topological SPT phase and the trivial SPT phase. The


emergence of clusters related to different phases of matter suggests that the state representation produced by our network also contains quantitative information about the topological


invariant \({{{\mathcal{Z}}}}_{{{\rm{R}}}}\). To extract this information, we use an additional neural network, which maps the state representation into a prediction of


\({{{\mathcal{Z}}}}_{{{\rm{R}}}}\). We train this additional network by randomly selecting 60 reference states (marked by gray squares in Fig. 7e) out of the set of 441 fiducial states, and


by minimizing the prediction error on the reference states. The predictions, together with the 60 exact values of the reference states, are shown in Fig. 7e. The absolute values of the differences


between the predictions and ground truths are shown in Fig. 7f. The predictions are close to the ground truths, except for the ground states near the phase boundaries, especially the


boundary of the topological SPT phase. The mismatch at the phase boundaries corresponds to the state representations incorrectly clustered in Fig. 7a, suggesting that our network struggles to learn


long-range correlations at phase boundaries.

GENERALIZATION TO QUANTUM SYSTEMS OF LARGER SIZE

We now show that our model is capable of extracting features that are transferable across


different system sizes. To this purpose, we use a training dataset generated from 10-qubit ground states of the bond-alternating XXZ model (3) and then we use the trained network to generate


state representations from the local measurement data of each 50-qubit ground state of (3). Note that, since we use measurements on subsystems of fixed size, the size of the input to our


neural network remains constant during both training and testing, independently of the total number of qubits in the system. During training on 10-qubit systems, the network is informed by


the index of the first qubit in each qubit triplet, which ranges from 0 to 7. For testing, this index ranges from 0 to 47. This index primarily labels the triplets without carrying specific


meaning. Numerical experiments below show that this index does not significantly affect the quality of predictions, likely due to the approximate translational symmetry. Alternatively,


one-hot encoding could specify qubit triplets, but this would complicate the neural network and introduce size dependence. Figure 8a shows that inputting the state representations into the


t-SNE algorithm still gives rise to clusters according to the three distinct phases of matter. This observation suggests that the neural network can effectively classify different phases of


the bond-alternating XXZ model, irrespective of the system size. In addition to clustering larger quantum states, the representation network also facilitates the prediction of quantum


properties in the larger system. To demonstrate this capability, we employ 40 reference ground states of the 50-qubit bond-alternating XXZ model, only half the size of the training dataset used for the 10-qubit system, to train two prediction networks: one for spin correlations and the other for mutual information. Figure 8b shows the coefficients of determination for each


prediction, which exhibit values around 0.9 or above. Figure 8b also shows the impact of inaccurate labeling of the ground states on our model. In the reported experiments, we assumed that


10% of the labels in the training dataset corresponding to 40 reference states are randomly incorrect, while the remaining 90% are accurate. Without any mitigation, we observe that the error


substantially impacts the accuracy of our predictions. On the other hand, employing a technique of noise mitigation during the training of prediction networks (see Supplementary Note 6) can


effectively reduce the impact of the incorrect labels.

MEASURING ALL QUBITS SIMULTANEOUSLY

We now apply our multi-task network to a scenario where all qubits are measured simultaneously


with suitable product observables. This scenario is motivated by recent experiments on trapped-ion systems33,36,37. In these experiments, the qubits were divided into groups of equal size,


and the same product of Pauli observables was measured simultaneously in all groups. Here, we adopt the settings of refs. 33,36, where the groups consist of three neighboring qubits. We consider


the ground states of a 50-qubit XXZ model and take \({{\mathcal{M}}}\) to be the set of all 27 measurements that measure the same three-qubit Pauli observable on each triplet. Compared


to the set of \({3}^{50}\) products of Pauli observables on all qubits, this choice significantly reduces the number of measurement settings the experimenter has to sample from. As an example, we


choose \({{\mathcal{S}}}\subset {{\mathcal{M}}}\) as the set of three measurements corresponding to the cyclically permuted Pauli strings \({\sigma }_{1}^{x}{\sigma }_{2}^{y}{\sigma


}_{3}^{z}{\sigma }_{4}^{x}{\sigma }_{5}^{y}{\sigma }_{6}^{z}\cdots {\sigma }_{50}^{y}\), \({\sigma }_{1}^{y}{\sigma }_{2}^{z}{\sigma }_{3}^{x}{\sigma }_{4}^{y}{\sigma }_{5}^{z}{\sigma


}_{6}^{x}\cdots {\sigma }_{50}^{z}\), and \({\sigma }_{1}^{z}{\sigma }_{2}^{x}{\sigma }_{3}^{y}{\sigma }_{4}^{z}{\sigma }_{5}^{x}{\sigma }_{6}^{y}\cdots {\sigma }_{50}^{x}\). For each copy


of the quantum state, we randomly sample a measurement from \({{\mathcal{S}}}\) to apply to the state and perform a total of 300 measurements. We then use the marginal distributions of the


outcomes on every qubit triplet as the input to our representation network to produce state representations.
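
Extracting the triplet marginals from the recorded bit strings is a simple counting exercise; a sketch (data layout and names are our own) is:

```python
import numpy as np

def triplet_marginals(bitstrings, n_qubits):
    """Empirical 8-outcome marginal distribution of every neighboring
    qubit triplet (i, i+1, i+2), estimated from full n-qubit outcome
    strings that were all recorded under the same product measurement.

    bitstrings: array-like of shape (shots, n_qubits) with entries 0/1."""
    shots = np.asarray(bitstrings, dtype=int)
    marginals = {}
    for i in range(n_qubits - 2):
        idx = shots[:, i:i + 3] @ np.array([4, 2, 1])   # outcome index 0..7
        marginals[i] = np.bincount(idx, minlength=8) / len(shots)
    return marginals
```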


Figure 9a shows the 2D projections of our data-driven state representations of 49 ground states of the bond-alternating XXZ model, obtained using the t-SNE algorithm, where all three different phases are clearly classified. It is interesting to compare the performance of


our neural network algorithm with the approach of principal component analysis (PCA) with shadow kernel26. In the original classical shadow method, measurements are randomly chosen from the


set of all possible Pauli measurements. To make a fair comparison, here we assume that the set of measurements performed in the laboratory is \({{\mathcal{S}}}\), the set of measurements


used by our method. Figure 9b shows the 2D projections of the shadow representations of the same set of ground states obtained by kernel PCA. The results show that PCA with the shadow kernel


can hardly distinguish the topological SPT phase from the trivial SPT phase in this restricted measurement setting. In contrast, our multi-task learning network appears to achieve a good


performance in distinguishing the different phases.

DISCUSSION

The use of short-range local measurements is a key distinction between our work and prior approaches using


randomized measurements22,23,24,25,26,27. Rather than performing randomized measurements over all spins together, we employ only randomized Pauli measurements detecting short-range


correlations. This feature is appealing for practical applications, as measuring only short-range correlations can significantly reduce the number of measurement settings probed in the


laboratory. Under restrictions on the set of Pauli measurements sampled in the laboratory, our algorithm outperforms the previous methods using classical shadows. On the other hand, the


restriction to short-range local measurements implies that the applicability of our method is limited to many-body quantum states with a constant correlation length, such as ground states


within an SPT phase. A crucial aspect of our neural network model is its ability to generate a latent state representation that integrates different pieces of information, corresponding to


multiple physical properties. Remarkably, the state representations appear to capture information about properties beyond those encountered in training. This feature allows for unsupervised


classification of phases of matter, applicable not only to in-distribution Hamiltonian ground states but also to out-of-distribution quantum states, like those produced by random circuits.


The model also appears to be able to generalize from smaller to larger quantum systems, which makes it an effective tool for exploring intermediate-scale quantum systems. For new quantum


systems, whose true phase diagrams are still unknown, discovering phase diagrams in an unsupervised manner is a major challenge. This challenge can potentially be addressed by combining our


neural network with consistency-checking, similar to the approach in ref. 18. The idea is to start with an initial, potentially inaccurate, phase diagram ansatz constructed from limited


prior knowledge, for instance, the results of clustering. Then, one can randomly select a set of reference states, labeling them according to the ansatz phases. Based on these labels, a


separate neural network is trained to predict phases. Finally, the ansatz can be revised based on the deviation from the network’s prediction, and the procedure can be iterated until it


converges to a stable ansatz. In Supplementary Note 7, we provide examples of this approach, leaving the development of a full algorithm for the autonomous discovery of phase diagrams as future


work.

METHODS

DATA GENERATION

Here we illustrate the procedures for generating the training and test datasets. For the one-dimensional cluster-Ising model, we obtain measurement statistics and


values for various properties in both the training and test datasets through direct calculations, leveraging the ground states solved by exact algorithms. In the case of the one-dimensional


bond-alternating XXZ model, we first obtain approximate ground states represented by matrix product states48,49 using the density-matrix renormalization group (DMRG)50 algorithm.


Subsequently, we compute the measurement statistics and properties by contracting the tensor networks. Noisy measurement statistics due to finite sampling are generated by sampling from the exact probability distribution of measurement outcomes. More details are provided in Supplementary Note 1.
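
In a sketch (ours, for illustration), simulating this finite-sampling noise amounts to drawing counts from a multinomial distribution:

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def noisy_distribution(d_exact, n_shots):
    """Empirical outcome frequencies from n_shots samples of the
    exact outcome distribution d_exact (which sums to 1)."""
    counts = rng.multinomial(n_shots, d_exact)
    return counts / n_shots
```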


REPRESENTATION NETWORK

The representation network operates on pairs of measurement outcome distributions and the parametrizations of the corresponding measurements, denoted as \({({{{\bf{d}}}}_{i},\,{{{\bf{m}}}}_{i})}_{i=1}^{s}\), associated with a


state _ρ_. This network primarily consists of three multilayer perceptrons (MLPs)51. The first MLP comprises a four-layer architecture that transforms the measurement outcome distribution


into \({{{\bf{h}}}}_{i}^{d}\), whereas the second, two-layer MLP maps the corresponding \({{{\bf{m}}}}_{i}\) to \({{{\bf{h}}}}_{i}^{m}\): $${{{\bf{h}}}}_{i}^{d}=\, {{{\rm{MLP}}}}_{1}({{{\bf{d}}}}_{i}),\\


{{{\bf{h}}}}_{i}^{m}=\, {{{\rm{MLP}}}}_{2}({{{\bf{m}}}}_{i}).$$ Next, we merge \({{{\bf{h}}}}_{i}^{d}\) and \({{{\bf{h}}}}_{i}^{m}\), feeding them into another three-layer MLP to obtain a


partial representation of the state, denoted as \({{{\bf{r}}}}_{\rho }^{(i)}\): $${{{\bf{r}}}}_{\rho }^{(i)}={{{\rm{MLP}}}}_{3}\left(\left[{{{\bf{h}}}}_{i}^{d},{{{\bf{h}}}}_{i}^{m}\right]\right).$$ (5) Following


this, we aggregate all the partial representations \({{{\bf{r}}}}_{\rho }^{(i)}\) through an average pooling layer to produce the complete state representation, denoted as \({{{\bf{r}}}}_{\rho }\): $${{{\bf{r}}}}_{\rho }=\frac{1}{s}{\sum }_{i=1}^{s}{{{\bf{r}}}}_{\rho }^{(i)}.$$ (6) Alternatively, we can leverage a recurrent neural network equipped with gated recurrent units (GRUs)52 to derive the complete state representation


from the set \({\{{{{\bf{r}}}}_{\rho }^{(i)}\}}_{i=1}^{s}\): $${{{\bf{z}}}}_{i}=\, {{\rm{sigmoid}}}\left({W}_{z}{{{\bf{r}}}}_{\rho }^{(i)}+{U}_{z}{{{\bf{h}}}}_{i-1}+{{{\bf{b}}}}_{{{\bf{z}}}}\right),\\ {\hat{{{\bf{h}}}}}_{i}=\, \tanh \left({W}_{h}{{{\bf{r}}}}_{\rho }^{(i)}+{U}_{h}({{{\bf{z}}}}_{i}\odot {{{\bf{h}}}}_{i-1})+{{{\bf{b}}}}_{{{\bf{h}}}}\right),\\ {{{\bf{h}}}}_{i}=\, (1-{{{\bf{z}}}}_{i})\odot {{{\bf{h}}}}_{i-1}+{{{\bf{z}}}}_{i}\odot {\hat{{{\bf{h}}}}}_{i},\\ {{{\bf{r}}}}_{\rho }=\, {{{\bf{h}}}}_{s},$$ where the _W_, _U_ and \({{\bf{b}}}\) are trainable matrices and vectors. The architecture of the recurrent neural network offers a more flexible approach to generating the complete state representation; however, in our experiments, we did not observe significant advantages compared to the average pooling layer.
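
A minimal PyTorch sketch of this representation network with average pooling (layer widths and input dimensions are our own illustrative choices, not the paper's exact architecture) is:

```python
import torch
import torch.nn as nn

def mlp(sizes):
    """Plain MLP with ReLU activations between layers."""
    layers = []
    for a, b in zip(sizes[:-1], sizes[1:]):
        layers += [nn.Linear(a, b), nn.ReLU()]
    return nn.Sequential(*layers[:-1])   # no activation after the last layer

class RepresentationNetwork(nn.Module):
    def __init__(self, d_dim=8, m_dim=9, rep_dim=64):
        # d_dim: 8 outcomes of a three-qubit measurement;
        # m_dim: size of the measurement parametrization (an assumption)
        super().__init__()
        self.mlp1 = mlp([d_dim, 64, 64, 64, 64])   # four layers for d_i
        self.mlp2 = mlp([m_dim, 64, 64])           # two layers for m_i
        self.mlp3 = mlp([128, 64, 64, rep_dim])    # three layers for the merge

    def forward(self, d, m):
        # d, m: tensors of shape (s, d_dim) and (s, m_dim), one row per POVM
        h = torch.cat([self.mlp1(d), self.mlp2(m)], dim=-1)
        r_i = self.mlp3(h)              # partial representations r_rho^(i)
        return r_i.mean(dim=0)          # average pooling -> r_rho
```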


RELIABILITY OF REPRESENTATIONS

The neural network can assess the reliability of each state representation by conducting contrastive analysis within the representation space. Figure 10 shows a measure of the reliability of each state representation,


which falls in the range [0, 1], for both the cluster-Ising model and the bond-alternating XXZ model: values closer to 0 indicate low reliability, and values closer to 1 indicate high reliability. Figure 10a indicates that the neural network exhibits lower confidence for


the ground states in the SPT phase than for those in the other two phases, with the lowest confidence occurring near the phase boundaries. Figure 10b shows that the reliability of predictions for the ground states of the XXZ model in the two SPT phases is higher than for those in the symmetry-broken phase, which is due to the imbalance of the training data, and that the predictions for quantum


states near the phase boundaries have the lowest reliability. Here, the reliability is associated with the distance between the state representation and its cluster center in the


representation space. We adopt this definition based on the intuition that the model should exhibit higher confidence for quantum states that cluster more easily.


Distance-based methods53,54 have proven effective in the task of out-of-distribution detection in classical machine learning. This task focuses on identifying instances that significantly


deviate from the data distribution observed during training, thereby potentially compromising the reliability of the trained neural network. Motivated by this line of research, we present a


contrastive methodology for assessing the reliability of representations produced by the proposed neural model. Denote the set of representations corresponding to _n_ quantum states as


\(\{{{{\bf{r}}}}_{{\rho }_{1}},{{{\bf{r}}}}_{{\rho }_{2}},\cdots \,,{{{\bf{r}}}}_{{\rho }_{n}}\}\). We leverage reachability distances, \({\{{d}_{{\rho }_{j}}\}}_{j=1}^{n}\), derived from


the OPTICS (Ordering Points To Identify the Clustering Structure) clustering algorithm55 to evaluate the reliability of representations, denoted as \({\{r{v}_{{\rho }_{j}}\}}_{j=1}^{n}\):


$${\{{d}_{{\rho }_{j}}\}}_{j=1}^{n}=\, {{\rm{OPTICS}}}\left({\{\phi ({{{\bf{r}}}}_{{\rho }_{j}})\}}_{j=1}^{n}\right),\\ r{v}_{{\rho }_{j}}=\, \frac{\exp (-{d}_{{\rho }_{j}})}{{\max }_{k=1}^{n}\exp (-{d}_{{\rho }_{k}})},$$ where _ϕ_ is a feature encoder. In the OPTICS clustering algorithm, a smaller reachability distance indicates that the associated point lies closer


to the center of its corresponding cluster, thereby facilitating its clustering process. Intuitively, a higher density within a specific region of the representation space indicates that


the trained neural model has had more opportunities to gather information from that area, thus enhancing its reliability. Our proposed method is supported by similar concepts introduced in


ref. 54. More details are provided in Supplementary Note 3.
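
A sketch of this reliability score based on scikit-learn's OPTICS implementation (taking the feature encoder _ϕ_ to be the identity for simplicity; all names are ours) is:

```python
import numpy as np
from sklearn.cluster import OPTICS

def reliability_scores(reps):
    """Map each state representation to a reliability value in [0, 1]
    via its OPTICS reachability distance (smaller distance = closer to
    a cluster core = more reliable)."""
    reach = OPTICS(min_samples=5).fit(reps).reachability_
    # points with infinite reachability are clipped to the largest finite one
    finite_max = np.nanmax(reach[np.isfinite(reach)])
    reach = np.where(np.isinf(reach), finite_max, reach)
    rv = np.exp(-reach)
    return rv / rv.max()
```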


PREDICTION NETWORK

For each type of property associated with the state, we employ a dedicated prediction network responsible for making predictions. Each prediction network is composed of three MLPs. The first MLP takes the state representation \({{{\bf{r}}}}_{\rho }\) as input and transforms it into a feature vector \({{{\bf{h}}}}^{{{\bf{r}}}}\), while the second


takes the query task index _q_ as input and transforms it into a feature vector \({{{\bf{h}}}}^{q}\). The third MLP operates on the combined feature vector \([{{{\bf{h}}}}^{{{\bf{r}}}},{{{\bf{h}}}}^{q}]\) to produce the prediction _f__q_(_ρ_)


for the property under consideration: $${{{\bf{h}}}}^{{{\bf{r}}}}=\, {{{\rm{MLP}}}}_{4}({{{\bf{r}}}}_{\rho }),\\ {{{\bf{h}}}}^{q}=\, {{{\rm{MLP}}}}_{5}(q),\\ {f}_{q}(\rho )=\,


{{{\rm{MLP}}}}_{6}([{{{\bf{h}}}}^{{{\bf{r}}}},\,{{{\bf{h}}}}^{q}]).$$
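
A matching PyTorch sketch of a single prediction network (again with illustrative layer widths of our own choosing) is:

```python
import torch
import torch.nn as nn

class PredictionNetwork(nn.Module):
    """One decoder D_k: maps (state representation r_rho, query q)
    to a scalar prediction f_q(rho)."""
    def __init__(self, rep_dim=64, q_dim=1):
        super().__init__()
        self.mlp4 = nn.Sequential(nn.Linear(rep_dim, 64), nn.ReLU())
        self.mlp5 = nn.Sequential(nn.Linear(q_dim, 16), nn.ReLU())
        self.mlp6 = nn.Sequential(nn.Linear(80, 64), nn.ReLU(),
                                  nn.Linear(64, 1))

    def forward(self, r, q):
        h = torch.cat([self.mlp4(r), self.mlp5(q)], dim=-1)
        return self.mlp6(h)
```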


NETWORK TRAINING

We employ the stochastic gradient descent56 optimization algorithm and the Adam optimizer57 to train our neural network. In our training procedure, for each state within the training dataset, we jointly train both the representation network and the prediction networks associated with one or


two types of properties available for that specific state. The training loss is the cumulative sum of losses across different states and properties. This training is achieved by minimizing


the difference between the predicted values generated by the network and the ground-truth values, thus refining the model’s ability to capture and reproduce the desired property


characteristics. The detailed pseudocode for the training process can be found in Supplementary Note 2.

NETWORK TEST & TRANSFER LEARNING

After the training is concluded, the multi-task


networks are fixed. To evaluate the performance of the trained model, we perform a series of tests on a separate dataset that includes states not seen during training. This evaluation helps


in assessing the model’s ability to generalize to new data. To achieve transfer learning for new tasks using state representations produced by the representation network, we first fix the


representation network and obtain the state representations. We then introduce a new prediction network that takes these state representations as input, allowing us to leverage the


pre-trained representations to predict new properties. During the training of this new task, we use the Adam optimizer57 and stochastic gradient descent56 to minimize the prediction error.


Once the training is complete, we fix this new prediction network and test its performance on previously unseen states to evaluate its generalization capability.

HARDWARE

We employ the


PyTorch framework58 to construct the multi-task neural networks in all our experiments and train them with two NVIDIA GeForce GTX 1080 Ti GPUs.

DATA AVAILABILITY

Data sets generated during


the current study are available at https://github.com/yzhuqici/learn_quantum_properties_from_local_correlation.

CODE AVAILABILITY

The codes that support the findings of this study are


available at https://github.com/yzhuqici/learn_quantum_properties_from_local_correlation.

REFERENCES

1. Torlai, G. et al. Neural-network quantum state tomography. _Nat. Phys._ 14, 447–450 (2018).
2. Carrasquilla, J., Torlai, G., Melko, R. G. & Aolita, L. Reconstructing quantum states with generative models. _Nat. Mach. Intell._ 1, 155–161 (2019).
3. Zhu, Y. et al. Flexible learning of quantum states with generative query neural networks. _Nat. Commun._ 13, 6222 (2022).
4. Schmale, T., Reh, M. & Gärttner, M. Efficient quantum state tomography with convolutional neural networks. _npj Quantum Inf._ 8, 115 (2022).
5. Carleo, G. & Troyer, M. Solving the quantum many-body problem with artificial neural networks. _Science_ 355, 602–606 (2017).
6. Zhang, X. et al. Direct fidelity estimation of quantum states using machine learning. _Phys. Rev. Lett._ 127, 130503 (2021).
7. Xiao, T., Huang, J., Li, H., Fan, J. & Zeng, G. Intelligent certification for quantum simulators via machine learning. _npj Quantum Inf._ 8, 138 (2022).
8. Du, Y. et al. ShadowNet for data-centric quantum system learning. Preprint at arXiv:2308.11290 (2023).
9. Wu, Y.-D., Zhu, Y., Bai, G., Wang, Y. & Chiribella, G. Quantum similarity testing with convolutional neural networks. _Phys. Rev. Lett._ 130, 210601 (2023).
10. Qian, Y., Du, Y., He, Z., Hsieh, M.-H. & Tao, D. Multimodal deep representation learning for quantum cross-platform verification. _Phys. Rev. Lett._ 133, 130601 (2024).
11. Gao, J. et al. Experimental machine learning of quantum states. _Phys. Rev. Lett._ 120, 240501 (2018).
12. Gray, J., Banchi, L., Bayat, A. & Bose, S. Machine-learning-assisted many-body entanglement measurement. _Phys. Rev. Lett._ 121, 150503 (2018).
13. Koutný, D. et al. Deep learning of quantum entanglement from incomplete measurements. _Sci. Adv._ 9, eadd7131 (2023).
14. Torlai, G. et al. Integrating neural networks with a quantum simulator for state reconstruction. _Phys. Rev. Lett._ 123, 230504 (2019).
15. Huang, Y. et al. Measuring quantum entanglement from local information by machine learning. Preprint at arXiv:2209.08501 (2022).
16. Smith, A. W. R., Gray, J. & Kim, M. S. Efficient quantum state sample tomography with basis-dependent neural networks. _PRX Quantum_ 2, 020348 (2021).
17. Carrasquilla, J. & Melko, R. G. Machine learning phases of matter. _Nat. Phys._ 13, 431–434 (2017).
18. Van Nieuwenburg, E. P., Liu, Y.-H. & Huber, S. D. Learning phase transitions by confusion. _Nat. Phys._ 13, 435–439 (2017).
19. Huembeli, P., Dauphin, A. & Wittek, P. Identifying quantum phase transitions with adversarial neural networks. _Phys. Rev. B_ 97, 134109 (2018).
20. Rem, B. S. et al. Identifying quantum phase transitions using artificial neural networks on experimental data. _Nat. Phys._ 15, 917–920 (2019).
21. Kottmann, K., Huembeli, P., Lewenstein, M. & Acín, A. Unsupervised phase discovery with deep anomaly detection. _Phys. Rev. Lett._ 125, 170603 (2020).
22. Huang, H.-Y., Kueng, R. & Preskill, J. Predicting many properties of a quantum system from very few measurements. _Nat. Phys._ 16, 1050–1057 (2020).
23. Elben, A. et al. Cross-platform verification of intermediate scale quantum devices. _Phys. Rev. Lett._ 124, 010504 (2020).
24. Elben, A. et al. Many-body topological invariants from randomized measurements in synthetic quantum matter. _Sci. Adv._ 6, eaaz3666 (2020).
25. Huang, H.-Y. Learning quantum states from their classical shadows. _Nat. Rev. Phys._ 4, 81–81 (2022).
26. Huang, H.-Y., Kueng, R., Torlai, G., Albert, V. V. & Preskill, J. Provably efficient machine learning for quantum many-body problems. _Science_ 377, eabk3333 (2022).
27. Elben, A. et al. The randomized measurement toolbox. _Nat. Rev. Phys._ 5, 9–24 (2023).
28. Zhao, H. et al. Learning quantum states and unitaries of bounded gate complexity. Preprint at arXiv:2310.19882 (2023).
29. Hu, H.-Y., Choi, S. & You, Y.-Z. Classical shadow tomography with locally scrambled quantum dynamics. _Phys. Rev. Res._ 5, 023027 (2023).
30. Hu, H.-Y. et al. Demonstration of robust and efficient quantum property learning with shallow shadows. Preprint at arXiv:2402.17911 (2024).
31. Cramer, M. et al. Efficient quantum state tomography. _Nat. Commun._ 1, 149 (2010).
32. Baumgratz, T., Nüßeler, A., Cramer, M. & Plenio, M. B. A scalable maximum likelihood method for quantum state tomography. _N. J. Phys._ 15, 125004 (2013).
33. Lanyon, B. et al. Efficient tomography of a quantum many-body system. _Nat. Phys._ 13, 1158–1162 (2017).
34. Kurmapu, M. K. et al. Reconstructing complex states of a 20-qubit quantum simulator. _PRX Quantum_ 4, 040345 (2023).
35. Guo, Y. & Yang, S. Quantum state tomography with locally purified density operators and local measurements. _Commun. Phys._ 7, 322 (2024).
36. Friis, N. et al. Observation of entangled states of a fully controlled 20-qubit system. _Phys. Rev. X_ 8, 021012 (2018).
37. Joshi, M. K. et al. Exploring large-scale entanglement in quantum simulation. _Nature_ 624, 539–544 (2023).
38. Zhang, Y. & Yang, Q. A survey on multi-task learning. _IEEE Trans. Knowl. Data Eng._ 34, 5586–5609 (2021).
39. Klyachko, A. A. Quantum marginal problem and N-representability. _J. Phys.: Conf. Ser._ 36, 72 (2006).
40. Christandl, M. & Mitchison, G. The spectra of quantum states and the Kronecker coefficients of the symmetric group. _Commun. Math. Phys._ 261, 789–797 (2006).
41. Schilling, C. The quantum marginal problem. In _Mathematical Results in Quantum Mechanics: Proceedings of the QMath12 Conference_, 165–176 (World Scientific, 2015).
42. Pollmann, F. & Turner, A. M. Detection of symmetry-protected topological phases in one dimension. _Phys. Rev. B_ 86, 125441 (2012).
43. Smacchia, P. et al. Statistical mechanics of the cluster Ising model. _Phys. Rev. A_ 84, 022304 (2011).
44. Cong, I., Choi, S. & Lukin, M. D. Quantum convolutional neural networks. _Nat. Phys._ 15, 1273–1278 (2019).
45. Herrmann, J. et al. Realizing quantum convolutional neural networks on a superconducting quantum processor to recognize quantum phases. _Nat. Commun._ 13, 4144 (2022).
46. Cohen, P. R. & Howe, A. E. How evaluation guides AI research: the message still counts more than the medium. _AI Mag._ 9, 35–35 (1988).
47. Liu, Y.-J., Smith, A., Knap, M. & Pollmann, F. Model-independent learning of quantum phases of matter with quantum convolutional neural networks. _Phys. Rev. Lett._ 130, 220603 (2023).
48. Fannes, M., Nachtergaele, B. & Werner, R. F. Finitely correlated states on quantum spin chains. _Commun. Math. Phys._ 144, 443–490 (1992).
49. Perez-García, D., Verstraete, F., Wolf, M. M. & Cirac, J. I. Matrix product state representations. _Quantum Inf. Comput._ 7, 401–430 (2007).
50. Schollwöck, U. The density-matrix renormalization group. _Rev. Mod. Phys._ 77, 259 (2005).
51. Gardner, M. W. & Dorling, S. Artificial neural networks (the multilayer perceptron)—a review of applications in the atmospheric sciences. _Atmos. Environ._ 32, 2627–2636 (1998).
52. Chung, J., Gulcehre, C., Cho, K. & Bengio, Y. Empirical evaluation of gated recurrent neural networks on sequence modeling. In _NIPS 2014 Workshop on Deep Learning_ (2014).
53. Lee, K., Lee, K., Lee, H. & Shin, J. A simple unified framework for detecting out-of-distribution samples and adversarial attacks. In _Advances in Neural Information Processing Systems_ 31 (2018).
54. Sun, Y., Ming, Y., Zhu, X. & Li, Y. Out-of-distribution detection with deep nearest neighbors. In _International Conference on Machine Learning_, 20827–20840 (PMLR, 2022).
55. Ankerst, M., Breunig, M. M., Kriegel, H.-P. & Sander, J. OPTICS: ordering points to identify the clustering structure. _ACM SIGMOD Rec._ 28, 49–60 (1999).
56. Bottou, L. Stochastic gradient descent tricks. In _Neural Networks: Tricks of the Trade_, 2nd edn, 421–436 (Springer, 2012).
57. Kingma, D. P. & Ba, J. Adam: a method for stochastic optimization. Preprint at arXiv:1412.6980 (2014).
58. Paszke, A. et al. PyTorch: an imperative style, high-performance deep learning library. In _Advances in Neural Information Processing Systems_ 32 (2019).
59. Williamson, D. F., Parker, R. A. & Kendrick, J. S. The box plot: a simple visual method to interpret data. _Ann. Intern. Med._ 110, 916–921 (1989).

ACKNOWLEDGEMENTS

We thank Ge Bai, Dong-Sheng Wang, Shuo Yang, Yuchen Guo and Jiehang Zhang for the


helpful discussions on many-body quantum systems. This work was supported by funding from the Hong Kong Research Grant Council through grants no. 17300918 and no. 17307520 (GC), through the


Senior Research Fellowship Scheme SRFS2021-7S02 (GC), the Chinese Ministry of Science and Education through grant 2023ZD0300600 (GC), and the John Templeton Foundation through grant 62312


(GC), The Quantum Information Structure of Spacetime (qiss.fr). YDW acknowledges funding from the National Natural Science Foundation of China through grants no. 12405022. YXW acknowledges


funding from the National Natural Science Foundation of China through grants no. 61872318. Research at the Perimeter Institute is supported by the Government of Canada through the Department


of Innovation, Science and Economic Development Canada and by the Province of Ontario through the Ministry of Research, Innovation and Science. The opinions expressed in this publication


are those of the authors and do not necessarily reflect the views of the John Templeton Foundation. AUTHOR INFORMATION Author notes * These authors contributed equally: Ya-Dong Wu, Yan Zhu.


AUTHORS AND AFFILIATIONS

* John Hopcroft Center for Computer Science, Shanghai Jiao Tong University, Shanghai, China: Ya-Dong Wu
* QICI Quantum Information and Computation Initiative, Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong: Ya-Dong Wu, Yan Zhu & Giulio Chiribella
* AI Technology Lab, Department of Computer Science, The University of Hong Kong, Pokfulam Road, Hong Kong: Yuexuan Wang
* College of Computer Science and Technology, Zhejiang University, Hangzhou, Zhejiang Province, China: Yuexuan Wang
* Department of Computer Science, Parks Road, Oxford, United Kingdom: Giulio Chiribella
* Perimeter Institute for Theoretical Physics, Waterloo, Ontario, Canada: Giulio Chiribella

CONTRIBUTIONS

Y.-D.W. and Y.Z. developed the key idea for this paper. Y.Z. conducted the numerical experiments, while G.C. and Y.W. contributed to their design.


All coauthors contributed to the writing of the manuscript.

CORRESPONDING AUTHORS

Correspondence to Yan Zhu or Giulio Chiribella.

ETHICS DECLARATIONS

COMPETING INTERESTS

The authors declare no competing interests.

PEER REVIEW INFORMATION

_Nature Communications_ thanks Alistair Smith and the other, anonymous, reviewers for their contribution to the peer review of this work. A peer review file is available.

ADDITIONAL INFORMATION

PUBLISHER’S NOTE

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

SUPPLEMENTARY INFORMATION

* Supplementary Information
* Peer Review File

RIGHTS AND PERMISSIONS

OPEN ACCESS This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

ABOUT THIS ARTICLE

CITE THIS ARTICLE

Wu, Y.-D., Zhu, Y., Wang, Y. et al. Learning quantum properties from short-range correlations using multi-task networks. _Nat. Commun._ 15, 8796 (2024). https://doi.org/10.1038/s41467-024-53101-y

* Received: 07 November 2023
* Accepted: 30 September 2024
* Published: 11 October 2024
* DOI: https://doi.org/10.1038/s41467-024-53101-y

