Axioms of Quantum Mechanics
Quantum Mechanics is premised on a series of axioms from which we can characterize the theory. The first 3 of these relate directly to the nature of quantum states.

States
A quantum state is a ray in a Hilbert space $\mathcal{H}$. The significance of this definition is that quantum states are described as normalised vectors in a complex vector space, taken up to phase equivalence: $|\psi\rangle$ and $e^{i\alpha}|\psi\rangle$ describe the same state. Relative phases only become significant when we are constructing states from linear superpositions of other states.
Observables
An observable is a property of a physical system that can be measured, and is described by a self-adjoint operator, $A = A^\dagger$.
The significance of the self-adjoint requirement is that the eigenvalues of such an operator are real-valued. Operators acting on quantum states are not necessarily self-adjoint, but must be to correspond to an observable.
Measurement
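A measurement is a process in which we acquire information about the state of the physical system. This corresponds to obtaining an eigenvalue $a_n$ of the observable $A$, and preparing the state in the corresponding eigenstate.
Given the projector $E_n$ onto the eigenspace of outcome $a_n$, and the 'amplitude' $E_n|\psi\rangle$, we have probability $p(a_n) = \|E_n|\psi\rangle\|^2 = \langle\psi|E_n|\psi\rangle$ and resulting state $E_n|\psi\rangle / \|E_n|\psi\rangle\|$.
The expectation value of a measurement is $\langle A\rangle = \sum_n a_n\, p(a_n) = \langle\psi|A|\psi\rangle$.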
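The measurement rule can be sketched numerically for a single qubit. The state and the choice of Pauli Z as the observable are assumptions made for illustration, as are the helper names; this is a minimal sketch for non-degenerate eigenvalues, not a general implementation.

```python
import math

# Hypothetical example state (an assumption for illustration):
# |psi> = (|0> + i|1>)/sqrt(2), stored as the amplitudes [<0|psi>, <1|psi>].
psi = [1 / math.sqrt(2), 1j / math.sqrt(2)]

# Observable: Pauli Z, with eigenvalue +1 on |0> and -1 on |1>.
eigenvalues = [1, -1]
eigenstates = [[1, 0], [0, 1]]


def inner(u, v):
    """Inner product <u|v>; the first argument is conjugated."""
    return sum(complex(x).conjugate() * y for x, y in zip(u, v))


def born_probabilities(state):
    """p(a_n) = |<a_n|psi>|^2 for each (non-degenerate) eigenstate |a_n>."""
    return [abs(inner(e, state)) ** 2 for e in eigenstates]


def post_measurement_state(state, n):
    """Project onto eigenstate n and renormalise."""
    amplitude = inner(eigenstates[n], state)
    return [amplitude * complex(c) / abs(amplitude) for c in eigenstates[n]]


probs = born_probabilities(psi)
expectation = sum(a * p for a, p in zip(eigenvalues, probs))
after = post_measurement_state(psi, 1)
```

Here both outcomes are equally likely, so the expectation value of Z vanishes. The post-measurement state for outcome -1 is $|1\rangle$ multiplied by a phase, which is the same ray as $|1\rangle$.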
Dynamics
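The evolution of the quantum state is described by a unitary operator. This can be understood by considering the Schrödinger equation, which tells us that $i\hbar\frac{d}{dt}|\psi(t)\rangle = H|\psi(t)\rangle$. For short timescales, we can write this to first order as $|\psi(t+dt)\rangle = \left(1 - \frac{i}{\hbar}H\,dt\right)|\psi(t)\rangle$, and this 'evolution' operator $U(dt) = 1 - \frac{i}{\hbar}H\,dt$ is unitary ($U^\dagger U = 1$ to first order in $dt$) as the Hamiltonian is self-adjoint.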
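The unitarity claim can be checked in one line. The notation $U(dt) = 1 - \frac{i}{\hbar}H\,dt$ for the first-order evolution operator is assumed here:

```latex
U(dt)^\dagger U(dt)
  = \left(1 + \tfrac{i}{\hbar} H^\dagger\,dt\right)\left(1 - \tfrac{i}{\hbar} H\,dt\right)
  = 1 + \tfrac{i}{\hbar}\left(H^\dagger - H\right)dt + O(dt^2)
  = 1 + O(dt^2)
```

The first-order term vanishes exactly when $H = H^\dagger$, which is why self-adjointness of the Hamiltonian is the relevant condition.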
Composition
Given two systems A and B with associated Hilbert spaces $\mathcal{H}_A, \mathcal{H}_B$, their composite system is the tensor product $\mathcal{H}_{AB} = \mathcal{H}_A \otimes \mathcal{H}_B$.
This is highly distinct from classical mechanics, where state spaces combine under the Cartesian product. The tensor product is defined further in the link above.
It is worth noting a few aspects of these axioms. Firstly, we can see that the theory is linear; this property underlies a lot of 'quantum' behaviour such as the no-cloning theorem and the Holevo bound. We will also note that there is a fundamental tension here between measurement, which is probabilistic and instantaneously updates a state, and the deterministic unitary evolution.
Qubits
The smallest nontrivial Hilbert space is two-dimensional. These two-level systems are the building blocks of quantum information, and are called qubits in analogy to the classical bit $\{0, 1\}$. The primary basis of the qubit is called the computational basis, $\{|0\rangle, |1\rangle\}$, but the system can exist in any linear superposition $a|0\rangle + b|1\rangle$ with $|a|^2 + |b|^2 = 1$. The squared absolute values $|a|^2$ and $|b|^2$ represent the probabilities of obtaining a 0 or 1 in a computational basis measurement. The global phase of the state doesn't matter, but the relative phase between $a$ and $b$ is physically significant.
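The difference between global and relative phase can be illustrated with a short sketch. The specific states, the phase value 0.7 and the use of the $|\pm\rangle$ basis as a second measurement basis are assumptions chosen for illustration.

```python
import cmath
import math

s = 1 / math.sqrt(2)
Z_BASIS = [[1, 0], [0, 1]]      # computational basis |0>, |1>
X_BASIS = [[s, s], [s, -s]]     # |+>, |->


def probabilities(state, basis):
    """Outcome probabilities |<b|state>|^2 for each basis vector |b>."""
    return [
        abs(sum(complex(b).conjugate() * c for b, c in zip(vec, state))) ** 2
        for vec in basis
    ]


plus = [s, s]                                        # (|0> + |1>)/sqrt(2)
with_global_phase = [cmath.exp(0.7j) * c for c in plus]
with_relative_phase = [s, -s]                        # (|0> - |1>)/sqrt(2)

# A global phase changes no probabilities, in either basis.
z_plus = probabilities(plus, Z_BASIS)
z_global = probabilities(with_global_phase, Z_BASIS)
x_plus = probabilities(plus, X_BASIS)
x_global = probabilities(with_global_phase, X_BASIS)

# A relative phase is invisible in the Z basis but not in the X basis.
z_relative = probabilities(with_relative_phase, Z_BASIS)
x_relative = probabilities(with_relative_phase, X_BASIS)
```

All three states give 50/50 outcomes in the computational basis. Only the state with a different relative phase gives different statistics in the $|\pm\rangle$ basis.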
Symmetry transforms in QM
A symmetry transform is defined as one which leaves all observable properties of the system unchanged. In QM, this means that measurement probabilities must be left unchanged. In particular, a transform $|\psi\rangle \to |\psi'\rangle$ must preserve the absolute value of inner products, $|\langle\phi|\psi\rangle| = |\langle\phi'|\psi'\rangle|$.
A theorem by Wigner shows that a mapping of this form can always be chosen to be either unitary or anti-unitary. For continuous symmetries, we are restricted to the (linear) unitary case.
Symmetries, either continuous or discrete, can be joined into 'groups'; they are always invertible, the combination of two symmetries in the same group is also a symmetry, etc. For each symmetry operation $R$ we have a unitary operator $U(R)$, and in particular the combination $R_1 \circ R_2$ is described by the unitary $U(R_1)U(R_2) = e^{i\phi(R_1, R_2)}\,U(R_1 \circ R_2)$, where $e^{i\phi(R_1, R_2)}$ is a global phase allowed as quantum states are rays (see axiom 1).
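We can encode the requirement that our symmetry is also unaffected by the dynamics of the system by requiring that the operators $U(R)$ commute with the Hamiltonian, $[U(R), H] = 0$. For a continuous symmetry, this yields an observable, the generator of the symmetry, that commutes with the Hamiltonian and is therefore conserved.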
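For continuous symmetries, we can consider infinitesimal transformations $R = I + \epsilon T$, with an associated operator close to the identity, $U = 1 - i\epsilon Q + O(\epsilon^2)$, where unitarity of $U$ requires the generator $Q$ to be self-adjoint.
We can then build up finite transforms from these infinitesimal ones in the limit $U(\theta) = \lim_{N\to\infty}\left(1 - i\frac{\theta}{N}Q\right)^N = e^{-i\theta Q}$.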
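The limit that builds a finite transform from infinitesimal ones can be checked numerically. This sketch assumes the generator $Q = \sigma_x/2$ and the angle $\theta = 1$, both chosen only for illustration, and compares against the closed form $e^{-i\theta\sigma_x/2} = \cos(\theta/2)\,I - i\sin(\theta/2)\,\sigma_x$.

```python
import math


def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


theta = 1.0
Q = [[0, 0.5], [0.5, 0]]        # assumed generator: sigma_x / 2

# Infinitesimal step 1 - i (theta/N) Q with N = 2**steps.
steps = 20
N = 2 ** steps
U = [
    [(1 if i == j else 0) - 1j * (theta / N) * Q[i][j] for j in range(2)]
    for i in range(2)
]

# Squaring `steps` times raises the infinitesimal step to the N-th power.
for _ in range(steps):
    U = matmul(U, U)

c, sn = math.cos(theta / 2), math.sin(theta / 2)
exact = [[c, -1j * sn], [-1j * sn, c]]
max_error = max(abs(U[i][j] - exact[i][j]) for i in range(2) for j in range(2))
```

The residual `max_error` shrinks roughly like $1/N$, since each step is only unitary to first order.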
For more of a discussion, see the (currently nonexistent) group theory notes. Just to remark here that qubit symmetries are continuous symmetries generated by the Pauli matrices.
The Density Matrix Representation
When we extend to open, multipartite quantum systems, the axioms of quantum mechanics are made to 'bend' somewhat; this is due to the inability to access the full physical state of the 'environment'. In this case:
 States are no longer rays
 Measurements are not orthogonal projectors.
 Evolution is not unitary.
The modified form of these axioms can be determined using the 5th axiom, composition, as a guide. The full composite system obeys the axioms outlined above, but the individual subsystems do not. We can understand how to describe outcomes on A alone by considering the matrix elements of operators like $M_A \otimes I_B$.
This leads to the density matrix formalism. The density matrix of a state is given by the outer product of a state with itself, $\rho = |\psi\rangle\langle\psi|$.
The state that describes just the subsystem $A$ is given by the partial trace of the density matrix over the other subsystem $B$, $\rho_A = \mathrm{tr}_B(\rho_{AB})$.
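This representation is called the density operator, and it satisfies several criteria:
 1. Self-adjoint: $\rho = \rho^\dagger$
 2. Positive: $\langle\psi|\rho|\psi\rangle \geq 0$ for all $|\psi\rangle$
 3. Normalized: $\mathrm{tr}(\rho) = 1$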
This tells us that this matrix is diagonalisable in an orthonormal basis, with real and nonnegative eigenvalues that sum to 1.
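As a concrete check, the partial trace of a two-qubit Bell state can be computed directly. The choice of state and the amplitude ordering `psi[2*a + b]` are assumptions of this sketch, which handles two qubits only.

```python
import math


def reduced_density_matrix(psi):
    """rho_A = tr_B |psi><psi| for two qubits, with psi[2*a + b] = <a,b|psi>."""
    return [
        [
            sum(psi[2 * a + b] * complex(psi[2 * c + b]).conjugate() for b in range(2))
            for c in range(2)
        ]
        for a in range(2)
    ]


# Bell state (|00> + |11>)/sqrt(2).
bell = [1 / math.sqrt(2), 0, 0, 1 / math.sqrt(2)]
rho_A = reduced_density_matrix(bell)

trace = (rho_A[0][0] + rho_A[1][1]).real
determinant = (rho_A[0][0] * rho_A[1][1] - rho_A[0][1] * rho_A[1][0]).real
is_self_adjoint = all(
    abs(rho_A[i][j] - rho_A[j][i].conjugate()) < 1e-12
    for i in range(2)
    for j in range(2)
)
# For a 2x2 self-adjoint matrix, trace >= 0 and determinant >= 0 imply positivity.
is_positive = trace >= 0 and determinant >= -1e-12
```

The result is the maximally mixed state $I/2$, which satisfies all three criteria even though it is not a ray.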
Pure and Mixed States
If the state of a subsystem is a ray, then we say that state is 'pure'. If not, the state is said to be 'mixed'. We can test purity using the observation that, if the state is pure, then $\rho = |\psi\rangle\langle\psi|$ amounts to a projector onto the state, and thus it has the property that $\rho^2 = \rho$. A general density matrix has the form $\rho = \sum_a p_a |a\rangle\langle a|$, and thus $\mathrm{tr}(\rho^2) = \sum_a p_a^2 \leq 1$, with this inequality saturated if and only if the state is pure.
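A worked qubit example of the purity test, assuming a state that is diagonal in the computational basis with weight $p$ on $|0\rangle$:

```latex
\rho = p\,|0\rangle\langle 0| + (1-p)\,|1\rangle\langle 1|
\quad\Rightarrow\quad
\mathrm{tr}(\rho^2) = p^2 + (1-p)^2 = 1 - 2p(1-p) \leq 1
```

Equality holds only for $p = 0$ or $p = 1$, the two pure states, and the purity reaches its minimum of $1/2$ at $p = 1/2$, the maximally mixed state.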
A mixed state is an incoherent mixture of the states $\{|a\rangle\}$ which form a basis of the Hilbert space $\mathcal{H}_A$. This also gives us a way to use the density matrix as an 'entanglement witness'; when an entangled state is globally pure, its subsystems are incoherent mixtures of states. If the subsystem is pure, we know that the two systems cannot be entangled. However, the reverse is not necessarily true: a mixed subsystem only implies entanglement when the global state is known to be pure.
Bloch Vector representations
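The Bloch sphere is a way of representing qubit states, based on a correspondence between rays in the two-dimensional complex space $\mathbb{C}^2$ and points in $\mathbb{R}^3$. This is due to the similarity of the generators for $SU(2)$ unitaries and the generators for rotations in $\mathbb{R}^3$.
The Pauli matrices, together with the identity, also serve as a 'matrix basis' for $2 \times 2$ self-adjoint matrices, where we can characterise our density matrix as $\rho = \frac{1}{2}\left(I + \vec{n}\cdot\vec{\sigma}\right)$, where $\vec{n}$ is a vector in $\mathbb{R}^3$ with $|\vec{n}| \leq 1$, normalised ($|\vec{n}| = 1$) exactly when the state is pure.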
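The Bloch vector can be read off from a density matrix using $n_i = \mathrm{tr}(\rho\,\sigma_i)$. The following is a minimal sketch under that convention, with two hypothetical example states.

```python
import math


def bloch_vector(rho):
    """Components n_i = tr(rho sigma_i) of a 2x2 density matrix."""
    nx = (rho[0][1] + rho[1][0]).real
    ny = (1j * (rho[0][1] - rho[1][0])).real
    nz = (rho[0][0] - rho[1][1]).real
    return (nx, ny, nz)


def length(vec):
    """Euclidean length of a real 3-vector."""
    return math.sqrt(sum(c * c for c in vec))


rho_plus = [[0.5, 0.5], [0.5, 0.5]]       # pure state |+><+|
rho_mixed = [[0.5, 0.0], [0.0, 0.5]]      # maximally mixed state I/2

n_plus = bloch_vector(rho_plus)
n_mixed = bloch_vector(rho_mixed)
```

The pure state lands on the surface of the sphere along the x axis, while the maximally mixed state sits at the centre.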
Schmidt Decomposition
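The Schmidt decomposition is a technique for decomposing a pure state of a composite system in a pair of mutually orthogonal bases, $|\psi\rangle_{AB} = \sum_i \sqrt{p_i}\,|i\rangle_A \otimes |i'\rangle_B$, where $p_i$ are the eigenvalues of the reduced density matrix, and the number of nonzero terms in the sum is called the 'Schmidt rank'. A Schmidt rank of more than 1 is a witness of entanglement.
The partial trace for each subsystem is then easy to obtain from this form: $\rho_A = \sum_i p_i\,|i\rangle\langle i|_A$ and $\rho_B = \sum_i p_i\,|i'\rangle\langle i'|_B$.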
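A general decomposition of a bipartite state, $|\psi\rangle_{AB} = \sum_{a,\mu} \psi_{a\mu}\,|a\rangle_A \otimes |\mu\rangle_B$, is related to its Schmidt basis by unitary transforms on the bases in $\mathcal{H}_A$ and $\mathcal{H}_B$. In this case, these unitaries map the general matrix of coefficients $\psi_{a\mu}$ to a diagonal, nonnegative matrix $\mathrm{diag}(\sqrt{p_i})$; this is the singular value decomposition of $\psi$.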
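Two two-qubit examples illustrate how the coefficient matrix determines the Schmidt rank. The states are chosen here for illustration, and $\psi_{a\mu}$ denotes the matrix of coefficients in the computational basis:

```latex
|\psi_1\rangle = \tfrac{1}{2}\left(|00\rangle + |01\rangle + |10\rangle + |11\rangle\right)
  = |+\rangle \otimes |+\rangle,
\qquad
\psi_1 = \tfrac{1}{2}\begin{pmatrix} 1 & 1 \\ 1 & 1 \end{pmatrix},
\qquad \text{singular values } \{1, 0\}

|\psi_2\rangle = \tfrac{1}{\sqrt{2}}\left(|00\rangle + |11\rangle\right),
\qquad
\psi_2 = \tfrac{1}{\sqrt{2}}\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix},
\qquad \text{singular values } \left\{\tfrac{1}{\sqrt{2}}, \tfrac{1}{\sqrt{2}}\right\}
```

The first state has four terms in the computational basis but Schmidt rank 1, so it is a product state. The second has Schmidt rank 2 and is entangled.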
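If $\rho_A$ and $\rho_B$ have no degenerate eigenvalues other than 0, then the Schmidt decomposition of $|\psi\rangle_{AB}$ is determined by the eigendecompositions of the two subsystems: we diagonalise each and pair up eigenvectors with equal eigenvalues. If this is not the case, then we need more information to determine the Schmidt basis. In particular, given $\dim\mathcal{H}_A = \dim\mathcal{H}_B = d$, then $|\psi\rangle_{AB} = \frac{1}{\sqrt{d}}\sum_{i=1}^{d} |i\rangle_A \otimes |i'\rangle_B$ gives the partial trace of either system as the maximally mixed state $I/d$, meaning we have no information about the state of the subsystems.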
We can apply arbitrary simultaneous unitary transforms of the form $U \otimes U^*$ to such a maximally entangled state, and these will leave the state unchanged, which shows that the basis used for the Schmidt decomposition is ambiguous.
Ambiguity of the Ensemble
Given the properties of the density matrix described above, we can construct a valid density matrix from a convex combination of other density matrices, $\rho = \lambda\rho_1 + (1-\lambda)\rho_2$ with $0 \leq \lambda \leq 1$.
It is not possible to express pure states as a convex sum of other density matrices in this way; they form the extremal points of a convex subset of operators on a Hilbert space. For $d = 2$, all the boundary points of this set are extremal (pure), though this is no longer the case for $d > 2$.
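The convexity of density matrices means that, if we prepare a convex mixture $\rho = \lambda\rho_1 + (1-\lambda)\rho_2$ (that is, $\rho_1$ with probability $\lambda$ and $\rho_2$ with probability $1-\lambda$), the measurement outcomes are indistinguishable from those we would see had we just prepared $\rho$ directly; this is called the ambiguity of the ensemble.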
We can quantify this 'ambiguity' by considering how many ways a given point in the set can be expressed as a convex sum of extremal states. E.g. in the qubit case, a point $\vec{n}$ inside the Bloch sphere has a decomposition into the two endpoints of any chord of the sphere that passes through it. This is distinct from the classical case, where a probability distribution over $N$ states has a unique decomposition into extremal (deterministic) distributions.
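The ambiguity can be seen in the simplest case, the centre of the Bloch sphere. The two ensembles below (equal mixtures of $|0\rangle, |1\rangle$ and of $|+\rangle, |-\rangle$) are examples chosen for illustration; they correspond to two different chords through the same point.

```python
import math

s = 1 / math.sqrt(2)


def projector(state):
    """Outer product |state><state| for a qubit state vector."""
    return [
        [complex(state[i]) * complex(state[j]).conjugate() for j in range(2)]
        for i in range(2)
    ]


def mixture(weights, states):
    """Convex combination sum_k w_k |state_k><state_k|."""
    rho = [[0j, 0j], [0j, 0j]]
    for w, state in zip(weights, states):
        p = projector(state)
        for i in range(2):
            for j in range(2):
                rho[i][j] += w * p[i][j]
    return rho


rho_z = mixture([0.5, 0.5], [[1, 0], [0, 1]])      # |0>, |1> ensemble
rho_x = mixture([0.5, 0.5], [[s, s], [s, -s]])     # |+>, |-> ensemble

max_difference = max(
    abs(rho_z[i][j] - rho_x[i][j]) for i in range(2) for j in range(2)
)
```

Both preparations give the same density matrix $I/2$ up to rounding, so no measurement can tell them apart.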
Norms, Distance Measures and Uhlman's Theorem
Given a pair of density operators, we can ask how 'similar' are the two states they describe? How close are they in Hilbert space? We can use any one of a family of matrix p-norms, defined by $\|A\|_p = \left(\mathrm{tr}\left[(A^\dagger A)^{p/2}\right]\right)^{1/p}$.
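This leads to the definition of the 'fidelity' of two states, which generalises the overlap $|\langle\psi|\phi\rangle|^2$ of two pure states; the deviation of the fidelity from 1 quantifies how different the two states are. In terms of the 1-norm, we define $A = \sqrt{\rho}\sqrt{\sigma}$, and the fidelity is given by the square of the 1-norm of $A$, $F(\rho, \sigma) = \|\sqrt{\rho}\sqrt{\sigma}\|_1^2 = \left(\mathrm{tr}\sqrt{\sqrt{\rho}\,\sigma\sqrt{\rho}}\right)^2$.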
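As a consistency check, the fidelity reduces to the familiar overlap when one state is pure. This assumes the convention $F(\rho, \sigma) = \left(\mathrm{tr}\sqrt{\sqrt{\rho}\,\sigma\sqrt{\rho}}\right)^2$ and uses $\sqrt{\rho} = \rho$ for $\rho = |\psi\rangle\langle\psi|$:

```latex
\sqrt{\rho}\,\sigma\sqrt{\rho}
  = |\psi\rangle\langle\psi|\,\sigma\,|\psi\rangle\langle\psi|
  = \langle\psi|\sigma|\psi\rangle\,|\psi\rangle\langle\psi|
\quad\Rightarrow\quad
F(\rho, \sigma)
  = \left(\sqrt{\langle\psi|\sigma|\psi\rangle}\;\mathrm{tr}\,|\psi\rangle\langle\psi|\right)^2
  = \langle\psi|\sigma|\psi\rangle
```

If $\sigma = |\phi\rangle\langle\phi|$ is also pure, this becomes $F = |\langle\psi|\phi\rangle|^2$.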
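All norms in this family satisfy three properties:
 1. Triangle inequality: $\|A + B\|_p \leq \|A\|_p + \|B\|_p$
 2. Submultiplicativity: $\|AB\|_p \leq \|A\|_p\,\|B\|_p$
 3. Monotonicity: $\|A\|_q \leq \|A\|_p$ for $1 \leq p \leq q$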
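A related result to this is Uhlmann's theorem, which states that the fidelity between two states is equal to the maximum possible overlap of their 'purifications'. A purification of a state $\rho_A$ is a pure state $|\psi\rangle_{AB}$ such that $\mathrm{tr}_B\left(|\psi\rangle\langle\psi|\right) = \rho_A$. The general form of a purification can be given by $|\psi\rangle_{AB} = \left(\sqrt{\rho} \otimes U\right)|\tilde{\Phi}\rangle$, where $|\tilde{\Phi}\rangle = \sum_i |i\rangle \otimes |i\rangle$ is the (unnormalised) maximally entangled state and $U$ is a unitary operator.
Using purifications, Uhlmann's theorem states that $F(\rho, \sigma) = \max_{|\psi\rangle, |\phi\rangle} |\langle\psi|\phi\rangle|^2$, where the maximum is taken over purifications $|\psi\rangle$ of $\rho$ and $|\phi\rangle$ of $\sigma$.
A corollary of this is that fidelity is also monotonic: $F(\rho_{AB}, \sigma_{AB}) \leq F(\rho_A, \sigma_A)$.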
Trace Distance
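An alternative measure, called the trace distance, is inspired by a similar measure for the distance between classical probability distributions called the $L_1$ or Kolmogorov distance. For two classical distributions $\{p_x\}, \{q_x\}$, the trace distance is defined as $D(p_x, q_x) = \frac{1}{2}\sum_x |p_x - q_x|$.
This metric satisfies the norm criteria above, and is analogous to the Schatten 1-norm for matrices. It also has a clear geometric interpretation: for subsets $S$ of the index set $\{x\}$, the trace distance can be shown to be equal to $D(p_x, q_x) = \max_S \left|p(S) - q(S)\right| = \max_S \left|\sum_{x \in S} p_x - \sum_{x \in S} q_x\right|$.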
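For quantum states, we can define our measure as $D(\rho, \sigma) = \frac{1}{2}\mathrm{tr}|\rho - \sigma| = \frac{1}{2}\|\rho - \sigma\|_1$, with $|A| = \sqrt{A^\dagger A}$, which reduces to the classical formula in the event that $[\rho, \sigma] = 0$. For qubits, this distance actually amounts to half the Euclidean distance between their points on the Bloch sphere.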
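The qubit statement can be verified numerically. This sketch assumes the Bloch parametrisation $\rho = \frac{1}{2}(I + \vec{r}\cdot\vec{\sigma})$ and two hypothetical Bloch vectors; it uses the closed-form eigenvalues of a 2x2 self-adjoint matrix rather than a general eigensolver.

```python
import math


def density_from_bloch(r):
    """rho = (I + r . sigma) / 2 for a Bloch vector r = (x, y, z)."""
    x, y, z = r
    return [
        [0.5 * (1 + z), 0.5 * (x - 1j * y)],
        [0.5 * (x + 1j * y), 0.5 * (1 - z)],
    ]


def trace_distance(rho, sigma):
    """D = (1/2) tr|rho - sigma|, via the eigenvalues of the 2x2 difference."""
    d = [[rho[i][j] - sigma[i][j] for j in range(2)] for i in range(2)]
    tr = (d[0][0] + d[1][1]).real
    det = (d[0][0] * d[1][1] - d[0][1] * d[1][0]).real
    disc = math.sqrt(max(tr * tr - 4 * det, 0.0))
    eigenvalues = ((tr - disc) / 2, (tr + disc) / 2)
    return 0.5 * sum(abs(lam) for lam in eigenvalues)


r = (1.0, 0.0, 0.0)     # pure state |+>
t = (0.0, 0.0, 0.5)     # a mixed state on the z axis

distance = trace_distance(density_from_bloch(r), density_from_bloch(t))
half_euclidean = 0.5 * math.sqrt(sum((a - b) ** 2 for a, b in zip(r, t)))
```

The two numbers agree, and both equal $\sqrt{1.25}/2$ for this pair of states.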
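The trace distance is invariant under unitary operations, $D(U\rho U^\dagger, U\sigma U^\dagger) = D(\rho, \sigma)$. We can also generalise the trace distance to an analogous maximisation problem, $D(\rho, \sigma) = \max_P \mathrm{tr}\left(P(\rho - \sigma)\right)$, where the maximisation is taken over all projectors $P$, or alternatively over all positive operators $P \leq I$. This gives a neat interpretation of the distance as maximising the separation over all possible POVMs that could be used to distinguish the two states. Based on this interpretation, we can rewrite it as a maximisation over all POVMs $\{E_m\}$, $D(\rho, \sigma) = \max_{\{E_m\}} D(p_m, q_m)$, where $p_m = \mathrm{tr}(\rho E_m)$ and $q_m = \mathrm{tr}(\sigma E_m)$.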
We note that, under the action of a CPTP map $\mathcal{E}$, the trace distance is non-increasing: $D(\mathcal{E}(\rho), \mathcal{E}(\sigma)) \leq D(\rho, \sigma)$. Using the fact that the partial trace is also a CPTP map, this shows us that the trace distance between two subsystems is less than or equal to that between the composite systems, $D(\rho_A, \sigma_A) \leq D(\rho_{AB}, \sigma_{AB})$. As a greater trace distance corresponds to greater distinguishability, this matches the reasonable interpretation that we can't 'gain' information by discarding part of the system.
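This is related to the strong convexity of the trace distance. Given two probability distributions $\{p_i\}, \{q_i\}$ and corresponding sets of states $\{\rho_i\}, \{\sigma_i\}$, then $D\left(\sum_i p_i\rho_i, \sum_i q_i\sigma_i\right) \leq D(p_i, q_i) + \sum_i p_i\,D(\rho_i, \sigma_i)$.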
For more details on distance measures see chapter 9 of Nielsen & Chuang.