After theta(x0) has been computed we have Q(x0) is a subfield (identify x0 and theta(x0) here) of Q(x,y) with index 2. If y is not an element of Q(x0) then take y0=y, otherwise take y0=x. Now y0 is of degree 2 over Q(x0), but the algebraic relation between x0 and y0 is not yet of the form y0^2=polynomial in x0 of degree 3. Compute the relation between y0 and x0 as follows (here we assume y0=y, but the case y0=x can be done in the same way): substitute(y=y0, Res_x( f , numerator( x0 - theta(x0) )) ) Here x0 is used as a variable and theta(x0) is now written as element of Q(x,y). Now add an element of Q(x0) to y0 such that the relation becomes: y0^2=rational function in x0. Then multiply y0 by a suitable element of Q(x0) such that the relation becomes y0^2=polynomial in x0 of degree 3.
This approach didn't give a speedup in the test examples that I used, but the difference in speed was small. Maybe this approach would be better if I'd used the "first reduce the number of variables by a substitution before doing a resultant computation" trick.
Note that this approach gives rise to a different method of computing theta^{-1}(y). Because we start with y0=y and keep adapting y0 until we get y0^2=polynomial in x0 of degree 3. This gives a method for expressing y as a polynomial in y0 and x0 by keeping track of the changes that are made to y0 to obtain the form y0^2=polynomial in x0 of degree 3. The method works unless theta^{-1}(y) is an element of Q(x0), but one can write a different method to treat this special case. In the same way theta^{-1}(x) can be computed.
Computation of theta^{-1}(x)
Abbreviate theta^{-1}(x) as xv. In the generic case xv is algebraic of degree 2 over Q(x0) and algebraic of degree 3 over Q(y0). So the two minimum polynomials m2 and m3 have degree 2 and 3 in xv. Compute remainder(m3,m2) as a polynomial in xv. This remainder will be a polynomial in Q(x0,y0)[xv] of degree 1 in xv. Solving it gives xv. However, this turns out to be slow. Solving this 1'st degree equation comes down to a division of two large elements of Q(x0,y0) which costs a lot of time. The current version of IntBasis doesn't compute m3 anymore. Instead it computes two roots of m2 in Q(x0,y0) and checks which of these two is xv.